Continuous Multi-Armed Bandits and Multiparameter Processes
Mandelbaum, Avi
Ann. Probab., Volume 15 (1987) no. 4, pp. 1527-1556 / Harvested from Project Euclid
A general framework is proposed for continuous-time dynamic allocation models of a scarce resource among competing projects. The allocation model is formulated as a multi-armed bandit model and solved as a control problem of a multiparameter process. In contrast to discrete-time bandits, where only one arm can be pulled at a time, the continuous-time bandit must allow simultaneous pulls. The multiparameter approach allows a strong solution of diffusion-type bandits. Here the main problem is to define precisely how to switch among arms, and the solution involves local times.
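As a rough sketch of the multiparameter viewpoint the abstract alludes to (the notation here is an illustrative assumption, not quoted from the paper): with $d$ arms, an allocation strategy can be recorded as a vector of cumulative times $T(t) = (T_1(t), \dots, T_d(t))$, where $T_i(t)$ is the total time devoted to arm $i$ up to calendar time $t$, subject to

\[
T_i(0) = 0, \qquad T_i \ \text{nondecreasing}, \qquad \sum_{i=1}^{d} T_i(t) = t \quad \text{for all } t \ge 0 .
\]

In discrete time, each step increments exactly one $T_i$ by 1. In continuous time, the constraint lets several $T_i$ increase at once, at fractional rates that sum to one, which is the sense in which pulls can be simultaneous.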
Published: 1987-10-14
Classification: Multi-armed bandits, dynamic allocation, Gittins' index, multiparameter processes, diffusions, local time, optional increasing path, stochastic control, 62L99, 93E20, 60J60, 60K10, 60G17, 60J55
@article{1176991992,
     author = {Mandelbaum, Avi},
     title = {Continuous Multi-Armed Bandits and Multiparameter Processes},
     journal = {Ann. Probab.},
     volume = {15},
     number = {4},
     year = {1987},
     pages = {1527--1556},
     language = {en},
     url = {http://dml.mathdoc.fr/item/1176991992}
}
Mandelbaum, Avi. Continuous Multi-Armed Bandits and Multiparameter Processes. Ann. Probab., Volume 15 (1987) no. 4, pp. 1527-1556. http://gdmltest.u-ga.fr/item/1176991992/