A Conjecture of Berry Regarding A Bernoulli Two-Armed Bandit
Joshi, V. M.
Ann. Statist., Vol. 3 (1975) no. 1, pp. 189-202 / Harvested from Project Euclid
Two independent Bernoulli processes (arms) have unknown success probabilities $\rho$ and $\lambda$. The initial (a priori) information about $\rho$ and $\lambda$ is expressed by the probability distributions $dR(\rho) = C_R \rho^{r_0}(1 - \rho)^{r_0'} \, d\mu(\rho)$ for the right arm and $dL(\lambda) = C_L \lambda^{l_0}(1 - \lambda)^{l_0'} \, d\mu(\lambda)$ for the left arm, where $\mu$ is an arbitrary measure on the unit interval. A specified number $n$ of observations is made sequentially, with the arm selected at each stage depending on the previous observations and the initial information. A conjecture of Berry states that if the initial information about the right arm (given by $r_0 + r_0'$) is not greater than that about the left arm (given by $l_0 + l_0'$), and the initial expected value of $\rho$ is not less than that of $\lambda$, then for any $n$ the advantage (in terms of expected number of successes) of taking the first observation on the right arm is never less than that of taking it on the left arm. A proof of this conjecture is given in this paper.
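The statement can be checked numerically in a special case. The following is a minimal sketch, not the paper's method: it assumes that $\mu$ is Lebesgue measure and that the exponents are non-negative integers, so each prior is a Beta distribution and the mean of a density proportional to $p^s(1-p)^f$ is $(s+1)/(s+f+2)$. The function names `mean` and `arm_values` are hypothetical, introduced only for this illustration. The code uses backward induction to compute the expected number of successes in $n$ observations when the first observation is taken on the right or on the left arm and the best arm is chosen at every later stage.

```python
from fractions import Fraction
from functools import lru_cache


def mean(s, f):
    # Assumption: mu is Lebesgue measure, so a density proportional to
    # p^s (1 - p)^f is Beta(s + 1, f + 1) with mean (s + 1) / (s + f + 2).
    return Fraction(s + 1, s + f + 2)


@lru_cache(maxsize=None)
def arm_values(n, r, rp, l, lp):
    """Return (w_right, w_left) for n remaining observations.

    w_right (w_left) is the expected number of successes when the next
    observation is taken on the right (left) arm and the best arm is
    chosen at every later stage. (r, rp) and (l, lp) are the current
    exponents of the right-arm and left-arm densities.
    """
    if n == 0:
        return Fraction(0), Fraction(0)
    p_right, p_left = mean(r, rp), mean(l, lp)
    # A success adds 1 to the success exponent, a failure adds 1 to the
    # failure exponent; max(...) picks the better arm at the next stage.
    w_right = (p_right * (1 + max(arm_values(n - 1, r + 1, rp, l, lp)))
               + (1 - p_right) * max(arm_values(n - 1, r, rp + 1, l, lp)))
    w_left = (p_left * (1 + max(arm_values(n - 1, r, rp, l + 1, lp)))
              + (1 - p_left) * max(arm_values(n - 1, r, rp, l, lp + 1)))
    return w_right, w_left


if __name__ == "__main__":
    # Right arm: r_0 = 1, r_0' = 0 (less information, mean 2/3).
    # Left arm:  l_0 = 2, l_0' = 2 (more information, mean 1/2).
    for n in range(1, 9):
        w_right, w_left = arm_values(n, 1, 0, 2, 2)
        print(n, w_right, w_left, w_right >= w_left)
```

In this example the hypotheses of the conjecture hold ($r_0 + r_0' = 1 \le 4 = l_0 + l_0'$, and the prior mean of $\rho$ is at least that of $\lambda$), so the right-arm value should be at least the left-arm value for every $n$. Exact rational arithmetic via `Fraction` avoids floating-point ties obscuring the comparison.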
Published: 1975-01-14
Classification: Bernoulli two-armed bandit, prior distributions, expected advantage, Bernoulli parameters
@article{1176343007,
     author = {Joshi, V. M.},
     title = {A Conjecture of Berry Regarding A Bernoulli Two-Armed Bandit},
     journal = {Ann. Statist.},
     volume = {3},
     number = {1},
     year = {1975},
     pages = {189-202},
     language = {en},
     url = {http://dml.mathdoc.fr/item/1176343007}
}
Joshi, V. M. A Conjecture of Berry Regarding A Bernoulli Two-Armed Bandit. Ann. Statist., Vol. 3 (1975) no. 1, pp. 189-202. http://gdmltest.u-ga.fr/item/1176343007/