How to gamble with non-stationary X-armed bandits and have no regrets
dc.bibliographicCitation.seriesTitle | WIAS Preprints | eng |
dc.bibliographicCitation.volume | 2686 | |
dc.contributor.author | Avanesov, Valeriy | |
dc.date.accessioned | 2022-06-30T12:42:34Z | |
dc.date.available | 2022-06-30T12:42:34Z | |
dc.date.issued | 2020 | |
dc.description.abstract | In the X-armed bandit problem, an agent sequentially interacts with an environment that yields a reward based on the vector input the agent provides. The agent's goal is to maximise the sum of these rewards over some number of time steps. The problem and its variations have been the subject of numerous studies, suggesting sub-linear and sometimes optimal strategies. This paper introduces a new variation of the problem. We consider an environment that can abruptly change its behaviour an unknown number of times. To that end, we propose a novel strategy and prove that it attains sub-linear cumulative regret. Moreover, the obtained regret bound matches the best known bound for GP-UCB in the stationary case and approaches the minimax lower bound when the relation between an action and the corresponding reward is highly smooth. The theoretical result is supported by an experimental study. | eng |
dc.description.version | publishedVersion | eng |
dc.identifier.uri | https://oa.tib.eu/renate/handle/123456789/9336 | |
dc.identifier.uri | https://doi.org/10.34657/8374 | |
dc.language.iso | eng | |
dc.publisher | Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik | |
dc.relation.doi | https://doi.org/10.20347/WIAS.PREPRINT.2686 | |
dc.relation.issn | 2198-5855 | |
dc.rights.license | This document may be downloaded, read, stored and printed for your own use within the limits of § 53 UrhG but it may not be distributed via the internet or passed on to external parties. | eng |
dc.rights.license | Dieses Dokument darf im Rahmen von § 53 UrhG zum eigenen Gebrauch kostenfrei heruntergeladen, gelesen, gespeichert und ausgedruckt, aber nicht im Internet bereitgestellt oder an Außenstehende weitergegeben werden. | ger |
dc.subject.ddc | 510 | |
dc.subject.other | Bootstrap | eng |
dc.subject.other | change point detection | eng |
dc.subject.other | nonparametrics | eng |
dc.subject.other | regression | eng |
dc.subject.other | multiscale | eng |
dc.title | How to gamble with non-stationary X-armed bandits and have no regrets | eng |
dc.type | Report | eng |
dc.type | Text | eng |
dcterms.extent | 18 S. | |
tib.accessRights | openAccess | |
wgl.contributor | WIAS | |
wgl.subject | Mathematik | |
wgl.type | Report / Forschungsbericht / Arbeitspapier |
Files
Original bundle
- Name: wias_preprints_2686.pdf
- Size: 299.76 KB
- Format: Adobe Portable Document Format