How to gamble with non-stationary X-armed bandits and have no regrets

dc.bibliographicCitation.seriesTitle: WIAS Preprints [eng]
dc.bibliographicCitation.volume: 2686
dc.contributor.author: Avanesov, Valeriy
dc.date.accessioned: 2022-06-30T12:42:34Z
dc.date.available: 2022-06-30T12:42:34Z
dc.date.issued: 2020
dc.description.abstract: In the X-armed bandit problem, an agent sequentially interacts with an environment that yields a reward based on the vector input the agent provides. The agent's goal is to maximise the sum of these rewards over a number of time steps. The problem and its variations have been the subject of numerous studies, which have proposed strategies with sub-linear and sometimes optimal regret. This paper introduces a new variation of the problem: we consider an environment that can abruptly change its behaviour an unknown number of times. For this setting we propose a novel strategy and prove that it attains sub-linear cumulative regret. Moreover, the obtained regret bound matches the best known bound for GP-UCB in the stationary case and approaches the minimax lower bound when the relation between an action and the corresponding reward is highly smooth. The theoretical result is supported by an experimental study. [eng]
dc.description.version: publishedVersion [eng]
dc.identifier.uri: https://oa.tib.eu/renate/handle/123456789/9336
dc.identifier.uri: https://doi.org/10.34657/8374
dc.language.iso: eng
dc.publisher: Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik
dc.relation.doi: https://doi.org/10.20347/WIAS.PREPRINT.2686
dc.relation.issn: 2198-5855
dc.rights.license: This document may be downloaded, read, stored and printed for your own use within the limits of § 53 UrhG but it may not be distributed via the internet or passed on to external parties. [eng]
dc.rights.license: Dieses Dokument darf im Rahmen von § 53 UrhG zum eigenen Gebrauch kostenfrei heruntergeladen, gelesen, gespeichert und ausgedruckt, aber nicht im Internet bereitgestellt oder an Außenstehende weitergegeben werden. [ger]
dc.subject.ddc: 510
dc.subject.other: Bootstrap [eng]
dc.subject.other: change point detection [eng]
dc.subject.other: nonparametrics [eng]
dc.subject.other: regression [eng]
dc.subject.other: multiscale [eng]
dc.title: How to gamble with non-stationary X-armed bandits and have no regrets [eng]
dc.type: Report [eng]
dc.type: Text [eng]
dcterms.extent: 18 pages
tib.accessRights: openAccess
wgl.contributor: WIAS
wgl.subject: Mathematics
wgl.type: Report / Forschungsbericht / Arbeitspapier
Files
Original bundle
Name: wias_preprints_2686.pdf
Size: 299.76 KB
Format: Adobe Portable Document Format