Bandit problems with infinitely many arms
Berry, Donald A. ; Chen, Robert W. ; Zame, Alan ; Heath, David C. ; Shepp, Larry A.
Ann. Statist., Tome 25 (1997) no. 6, p. 2103-2116 / Harvested from Project Euclid
We consider a bandit problem consisting of a sequence of n choices from an infinite number of Bernoulli arms, with $n \to \infty$. The objective is to minimize the long-run failure rate. The Bernoulli parameters are independent observations from a distribution F. We first assume F to be the uniform distribution on (0, 1) and consider various extensions. In the uniform case we show that the best lower bound for the expected failure proportion is between $\sqrt{2}/\sqrt{n}$ and $2/\sqrt{n}$ and we exhibit classes of strategies that achieve the latter.
Publié le : 1997-10-14
Classification:  Bandit problems,  sequential experimentation,  dynamic allocation of Bernoulli processes,  staying with a winner,  switching with a loser,  62C25,  62L05,  60F99
@article{1069362389,
     author = {Berry, Donald A. and Chen, Robert W. and Zame, Alan and Heath, David C. and Shepp, Larry A.},
     title = {Bandit problems with infinitely many arms},
     journal = {Ann. Statist.},
     volume = {25},
     number = {6},
     year = {1997},
     pages = { 2103-2116},
     language = {en},
     url = {http://dml.mathdoc.fr/item/1069362389}
}
Berry, Donald A.; Chen, Robert W.; Zame, Alan; Heath, David C.; Shepp, Larry A. Bandit problems with infinitely many arms. Ann. Statist., Tome 25 (1997) no. 6, pp.  2103-2116. http://gdmltest.u-ga.fr/item/1069362389/