Sequential Selection of Experiments
Gray, K. B.
Ann. Math. Statist., Vol. 39 (1968), no. 6, pp. 1953-1977 / Harvested from Project Euclid
The problem of sequential selection of experiments, with fixed and optional stopping, is considered. Conditions are given which allow selection, stopping and terminal action rules to be based on a sequence $\{T_j\}$ of statistics, where $T_j$ is a function of past observations $\mathbf{X}^j = (X_1, \cdots, X_j)$ and experiment selections $\mathbf{E}^j = (E_1, \cdots, E_j)$. Randomized stopping, selection, and terminal action rules are allowed, and all probability distributions are defined by densities relative to $\sigma$-finite measures over Euclidean spaces. Here we give a heuristic description of the principal results for the case of optional stopping. At each time $j$ the random variable $X_j$ is observed and a decision is made to stop or continue. If the procedure is stopped, a terminal action $A$ is taken. If it is continued, an experiment $E_{j+1}$, to be performed at time $j + 1$, is chosen. At time $j$, all decisions are based on $\mathbf{X}^j,\mathbf{E}^j$, the past observations and experiment selections. Upon stopping, and taking action $A$, a loss $L(\theta, A)$, where $\theta$ is the unknown state of nature, is incurred. The sampling cost of stopping at $j$ is $C_j(\theta, \mathbf{X}^j, \mathbf{E}^j)$. Let the random variable $N$ denote the random stopping time. A selection rule $\mathbf{\gamma} = (\gamma_0, \gamma_1, \cdots)$ is defined by the sequence of conditional densities $\gamma_j(e_{j+1}\mid\mathbf{x}^j, \mathbf{e}^j)$, a stopping rule $\mathbf{\phi} = (\phi_0, \phi_1, \cdots)$ by the probabilities $\phi_j(\mathbf{x}^j,\mathbf{e}^j) = P\{N = j\mid N \geqq j, \mathbf{x}^j,\mathbf{e}^j\}$, and a terminal action rule $\mathbf{\delta} = (\delta_0, \delta_1, \cdots)$ by the conditional densities $\delta_j(a\mid\mathbf{x}^j,\mathbf{e}^j)$. Definition of the population densities $f_\theta(x_{j+1}\mid\mathbf{x}^j, \mathbf{e}^{j+1})$ for $j = 0, 1, 2, \cdots$ completely fixes the probability structure.
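The optional-stopping procedure described above can be sketched as a short simulation. This is only an illustration under stated assumptions, not the paper's construction: the experiments are taken to be two independent Bernoulli trials, $\theta$ is a pair of success probabilities, and the names `run_procedure`, `select`, `stop_prob`, `act`, `loss`, `cost` and the `max_steps` cap are all hypothetical.

```python
import random

def run_procedure(theta, select, stop_prob, act, loss, cost, rng, max_steps=100):
    """Simulate one run of the sequential procedure with optional stopping.

    theta     : state of nature; here a tuple of success probabilities (assumption)
    select    : plays the role of gamma_j, maps (xs, es) -> next experiment E_{j+1}
    stop_prob : plays the role of phi_j, maps (xs, es) -> P{N = j | N >= j, xs, es}
    act       : plays the role of delta_j, maps (xs, es) -> terminal action A
    """
    xs, es = [], []  # past observations X^j and experiment selections E^j
    for j in range(max_steps + 1):
        # At time j, stop with probability phi_j (forced at max_steps so the loop ends).
        if j == max_steps or rng.random() < stop_prob(xs, es):
            a = act(xs, es)
            return {"N": j, "action": a,
                    "total": loss(theta, a) + cost(j, theta, xs, es)}
        e = select(xs, es)                          # choose E_{j+1}
        x = 1 if rng.random() < theta[e] else 0     # observe X_{j+1}
        es.append(e)
        xs.append(x)

# Hypothetical rules, chosen only to make the sketch runnable.
def select(xs, es):
    return len(es) % 2                              # alternate the two experiments

def stop_prob(xs, es):
    return 1.0 if len(xs) >= 10 else 0.0            # stop after 10 observations

def act(xs, es):
    def rate(arm):
        obs = [x for x, e in zip(xs, es) if e == arm]
        return sum(obs) / len(obs) if obs else 0.0
    return 0 if rate(0) >= rate(1) else 1           # pick the arm that looks better

def loss(theta, a):
    return max(theta) - theta[a]                    # L(theta, A): regret of the chosen arm

def cost(j, theta, xs, es):
    return 0.01 * j                                 # C_j: constant cost per observation

result = run_procedure((0.3, 0.6), select, stop_prob, act, loss, cost, random.Random(0))
print(result)
```

The rules here are deterministic for simplicity; randomized rules fit the same interface by having `select` and `act` draw from the relevant conditional density.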
Define $\{T_j\}$ to be parameter sufficient (PARS) if, for $j = 0, 1, 2, \cdots$, $\operatorname{Dist}_{\theta,\mathbf{\gamma}}(\mathbf{X}^j, \mathbf{E}^j\mid T_j)$ is independent of $\theta$ for all $\mathbf{\gamma}$, and policy sufficient (POLS) if, for $j = 0, 1, 2, \cdots$, $\operatorname{Dist}_{\theta,\mathbf{\phi},\mathbf{\gamma}} (T_{j+1}\mid T_j, E_{j+1}, N \geqq j + 1)$ is independent of $\mathbf{\phi}, \mathbf{\gamma}$ for all $\theta$. THEOREM. If $\{T_j\}$ is PARS, then the class of policies $\{\mathbf{\phi}, \mathbf{\gamma}, \mathbf{\delta}^0\}$, where $\mathbf{\delta}^0$ is based on $\{T_j\}$, is essentially complete. THEOREM. If $\{T_j\}$ is PARS and POLS, and the sampling cost is of the form $C_j(\theta, T_j)$, then the class of policies $\{\mathbf{\phi}^0, \mathbf{\gamma}^0, \mathbf{\delta}^0\}$, where $\mathbf{\phi}^0, \mathbf{\gamma}^0, \mathbf{\delta}^0$ are based on $\{T_j\}$, is essentially complete. Conditions are given to aid in the verification of PARS and POLS. The theorems are applied to examples, including versions of the two armed bandit problem.
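To give a feel for what a statistic $T_j$ might look like, the sketch below uses a two-armed bandit with independent Bernoulli arms and takes $T_j$ to be the per-arm success and failure counts. This choice is an assumption made for illustration; the abstract does not say which statistics the paper uses in its examples. The code shows that $T_{j+1}$ can be updated from $T_j$, $E_{j+1}$ and $X_{j+1}$ alone, and that two histories with the same counts have the same likelihood for any $\theta$, which is the kind of property PARS is meant to capture. It is not a verification of the paper's conditions, and the names `update`, `summarize` and `likelihood` are hypothetical.

```python
from math import isclose

def update(t, e, x):
    """Return T_{j+1} given T_j, experiment e in {0, 1} and outcome x in {0, 1}.

    t = ((s0, f0), (s1, f1)): successes and failures observed on each arm.
    """
    arms = [list(t[0]), list(t[1])]
    arms[e][0 if x == 1 else 1] += 1
    return (tuple(arms[0]), tuple(arms[1]))

def summarize(xs, es):
    """Compute T_j from the full history (X^j, E^j) by repeated updates."""
    t = ((0, 0), (0, 0))
    for x, e in zip(xs, es):
        t = update(t, e, x)
    return t

def likelihood(theta, xs, es):
    """Product of the Bernoulli densities f_theta(x | e) along the history."""
    p = 1.0
    for x, e in zip(xs, es):
        p *= theta[e] if x == 1 else 1.0 - theta[e]
    return p

# Two different histories (observations, experiment selections) with equal counts.
h1 = ([1, 0, 1, 1], [0, 1, 0, 1])
h2 = ([1, 1, 1, 0], [1, 0, 0, 1])

same_statistic = summarize(*h1) == summarize(*h2)
same_likelihood = all(
    isclose(likelihood(th, *h1), likelihood(th, *h2))
    for th in [(0.3, 0.6), (0.5, 0.5), (0.9, 0.1)]
)
print(summarize(*h1), same_statistic, same_likelihood)
```

Because the $\theta$-dependent part of the likelihood enters only through the counts, a rule that looks only at $T_j$ discards nothing about $\theta$ in this simple model.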
Published: 1968-12-14
@article{1177698025,
     author = {Gray, K. B.},
     title = {Sequential Selection of Experiments},
     journal = {Ann. Math. Statist.},
     volume = {39},
     number = {6},
     year = {1968},
     pages = {1953--1977},
     language = {en},
     url = {http://dml.mathdoc.fr/item/1177698025}
}
Gray, K. B. Sequential Selection of Experiments. Ann. Math. Statist., Vol. 39 (1968), no. 6, pp. 1953-1977. http://gdmltest.u-ga.fr/item/1177698025/