A general framework is proposed for continuous time dynamic allocation models of a scarce resource among competing projects. The allocation model is formulated as a multi-armed bandit model and solved as a control problem of a multiparameter process. In contrast to discrete time bandits, where only one arm can be pulled at a time, the continuous time bandit must allow simultaneous pulls. The multiparameter approach allows a strong solution of diffusion-type bandits. Here the main problem is to define precisely how to switch among arms and the solution involves local times.
Published: 1987-10-14
Classification:
Multi-armed bandits,
dynamic allocation,
Gittins' index,
multiparameter processes,
diffusions,
local time,
optional increasing path,
stochastic control,
62L99,
93E20,
60J60,
60K10,
60G17,
60J55
@article{1176991992,
author = {Mandelbaum, Avi},
title = {Continuous Multi-Armed Bandits and Multiparameter Processes},
journal = {Ann. Probab.},
volume = {15},
number = {4},
year = {1987},
pages = {1527--1556},
language = {en},
url = {http://dml.mathdoc.fr/item/1176991992}
}
Mandelbaum, Avi. Continuous Multi-Armed Bandits and Multiparameter Processes. Ann. Probab., Vol. 15 (1987) no. 4, pp. 1527-1556. http://gdmltest.u-ga.fr/item/1176991992/