Two-Stage Bandits
Clayton, Murray K. ; Witmer, Jeffrey A.
Ann. Statist., Volume 16 (1988) no. 1, pp. 887-894 / Harvested from Project Euclid
Two stochastic processes, or "arms," that yield dichotomous responses are available for use in a two-stage decision problem. During the first stage, arms are chosen sequentially; the resulting observations are discounted by a fixed value $\beta$. A single arm must be used in the second stage, in which observations are not discounted. The decision to end the first stage is based on the data obtained. Optimal strategies are considered in the presence of the random discount sequence that arises in this setting. This extends the work of Berry and Fristedt (1979).
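The setting described in the abstract can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions and is not the optimal strategy studied in the paper: it assumes the i-th first-stage observation is weighted by $\beta^{i-1}$, a fixed second-stage length, independent Beta(1, 1) priors on the two success probabilities, and a simple greedy rule that ends the first stage once the posterior means differ by more than a threshold. The names `discounted_total`, `simulate_two_stage`, `max_stage1`, `stage2_len`, and `gap` are hypothetical and do not come from the source.

```python
import random


def discounted_total(stage1_rewards, stage2_rewards, beta):
    """Total payoff under the assumed weighting.

    Stage-1 reward number i (0-based) is weighted by beta**i (an assumption);
    stage-2 rewards are not discounted.
    """
    stage1 = sum(beta ** i * r for i, r in enumerate(stage1_rewards))
    return stage1 + sum(stage2_rewards)


def simulate_two_stage(p, beta, max_stage1, stage2_len, gap, rng):
    """Simulate one run with a heuristic (not optimal) first-stage policy."""
    # Assumed Beta(1, 1) prior per arm, stored as [successes + 1, failures + 1].
    counts = [[1, 1], [1, 1]]

    def post_mean(arm):
        s, f = counts[arm]
        return s / (s + f)

    def pull(arm):
        # Dichotomous response: 1 with probability p[arm], else 0.
        return 1 if rng.random() < p[arm] else 0

    stage1_rewards = []
    while len(stage1_rewards) < max_stage1:
        # Data-dependent stopping: end stage 1 once one arm looks clearly better.
        if abs(post_mean(0) - post_mean(1)) > gap:
            break
        # Greedy choice on posterior means; ties go to arm 0.
        arm = 0 if post_mean(0) >= post_mean(1) else 1
        r = pull(arm)
        counts[arm][0 if r else 1] += 1
        stage1_rewards.append(r)

    # Stage 2: commit to a single arm for all remaining observations.
    best = 0 if post_mean(0) >= post_mean(1) else 1
    stage2_rewards = [pull(best) for _ in range(stage2_len)]
    return discounted_total(stage1_rewards, stage2_rewards, beta)
```

A call such as `simulate_two_stage((0.6, 0.4), 0.9, 20, 50, 0.3, random.Random(0))` returns the payoff of one simulated run. Because the length of the first stage depends on the observed data, the sequence of weights actually applied is itself random, which is the feature the abstract refers to as a random discount sequence.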
Published: 1988-06-14
Classification: Two-stage bandit, sequential decisions, regular discounting, random discounting, 62C10
@article{1176350841,
     author = {Clayton, Murray K. and Witmer, Jeffrey A.},
     title = {Two-Stage Bandits},
     journal = {Ann. Statist.},
     volume = {16},
     number = {1},
     year = {1988},
     pages = {887-894},
     language = {en},
     url = {http://dml.mathdoc.fr/item/1176350841}
}
Clayton, Murray K.; Witmer, Jeffrey A. Two-Stage Bandits. Ann. Statist., Volume 16 (1988) no. 1, pp. 887-894. http://gdmltest.u-ga.fr/item/1176350841/