Multi-armed Bandit, Dynamic Environments and Meta-Bandits


1 LRI - Laboratoire de Recherche en Informatique 2 TANC - Algorithmic number theory for cryptology LIX - Laboratoire d'informatique de l'École polytechnique Palaiseau, Inria Saclay - Ile de France, Polytechnique - X, CNRS - Centre National de la Recherche Scientifique : UMR7161

Abstract : This paper presents the Adapt-EvE algorithm, which extends the UCBT online learning algorithm (Auer et al. 2002) to abruptly changing environments. Adapt-EvE features an adaptive change-point detection test based on Page-Hinkley statistics, and two alternative extra-exploration procedures, based respectively on smooth-restart and Meta-Bandits.
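The change-point test named in the abstract can be illustrated with a minimal sketch of a Page-Hinkley detector in its common textbook form, here set up to flag a drop in the mean of a reward stream. The class name `PageHinkley`, the parameters `delta` (tolerance) and `threshold` (alarm level), and their default values are assumptions made for illustration; the abstract does not give the exact variant or settings used in Adapt-EvE.

```python
class PageHinkley:
    """Textbook Page-Hinkley test for a decrease in the mean of a stream.

    Hypothetical illustration: names and default values are assumptions,
    not taken from the paper.
    """

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerance for small fluctuations
        self.threshold = threshold  # alarm level on the test statistic
        self.reset()

    def reset(self):
        self.n = 0          # number of observations seen so far
        self.mean = 0.0     # running mean of the observations
        self.cum = 0.0      # cumulative deviation m_T
        self.cum_max = 0.0  # running maximum M_T (starts at 0)

    def update(self, x):
        """Consume one observation; return True if a change is flagged."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        # m_T accumulates deviations from the running mean plus tolerance.
        self.cum += x - self.mean + self.delta
        self.cum_max = max(self.cum_max, self.cum)
        # A large gap M_T - m_T suggests the mean has dropped.
        return self.cum_max - self.cum > self.threshold


if __name__ == "__main__":
    detector = PageHinkley()
    stream = [1.0] * 200 + [0.0] * 50  # reward drops abruptly at step 200
    for t, reward in enumerate(stream):
        if detector.update(reward):
            print(f"change flagged at step {t}")
            detector.reset()
            break
```

In a bandit setting, one plausible use is to feed the detector the rewards of the currently preferred arm and trigger extra exploration when it fires; how Adapt-EvE adapts the test and what it does on an alarm is described in the paper itself.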

Keywords : multi-armed bandit, statistical learning, UCB

Authors: Cédric Hartland - Sylvain Gelly - Nicolas Baskiotis - Olivier Teytaud - Michèle Sebag


