The Optimal Reward Operator in Dynamic Programming
Blackwell, D. ; Freedman, D. ; Orkin, M.
Ann. Probab., Volume 2 (1974) no. 6, pp. 926-941 / Harvested from Project Euclid
Consider a dynamic programming problem with analytic state space $S$, analytic constraint set $A$, and semi-analytic reward function $r(x, P, y)$ for $(x, P)\in A$ and $y\in S$: namely, $\{r > a\}$ is an analytic set for all $a$. Let $Tf$ be the optimal reward in one move, with the modified reward function $r(x, P, y) + f(y)$. The optimal reward in $n$ moves is shown to be $T^n0$, a semi-analytic function on $S$. It is also shown that for any $n$ and positive $\varepsilon$, there is an $\varepsilon$-optimal strategy for the $n$-move game, measurable on the $\sigma$-field generated by the analytic sets.
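As an illustration only, the operator $T$ can be sketched on a finite model, where the measurability questions treated in the paper do not arise. The sketch assumes that a constraint pair $(x, P)$ means "at state $x$ the probability distribution $P$ on $S$ may be chosen for the next state", so that $(Tf)(x)$ is the largest expected value of $r(x, P, y) + f(y)$ over the admissible $P$. The names `apply_T`, `optimal_reward`, `actions` and the two-state example are hypothetical and do not come from the paper.

```python
from typing import Callable, List, Sequence, Tuple

Dist = Tuple[float, ...]  # a probability vector over the finite state set {0, ..., n-1}
Reward = Callable[[int, Dist, int], float]


def apply_T(f: Sequence[float], actions: Sequence[Sequence[Dist]], r: Reward) -> List[float]:
    """One application of the (finite, illustrative) optimal reward operator.

    actions[x] lists the distributions P with (x, P) in the constraint set.
    """
    n = len(f)
    out = []
    for x in range(n):
        # Expected value of r(x, P, y) + f(y) under P, maximized over admissible P.
        best = max(
            sum(P[y] * (r(x, P, y) + f[y]) for y in range(n))
            for P in actions[x]
        )
        out.append(best)
    return out


def optimal_reward(n_moves: int, actions: Sequence[Sequence[Dist]], r: Reward) -> List[float]:
    """Compute T^n 0 by iterating apply_T from the zero function."""
    f = [0.0] * len(actions)
    for _ in range(n_moves):
        f = apply_T(f, actions, r)
    return f


# Hypothetical two-state example: from state 0 one may stay put or flip a
# fair coin between the states; state 1 is absorbing. Landing in state 1 pays 1.
actions = [
    [(1.0, 0.0), (0.5, 0.5)],  # choices available at state 0
    [(0.0, 1.0)],              # only choice at state 1
]


def r(x: int, P: Dist, y: int) -> float:
    return float(y)


one_move = optimal_reward(1, actions, r)
two_moves = optimal_reward(2, actions, r)
```

In a finite model the maximum is always attained, so an optimal strategy exists. On general analytic spaces the supremum need not be attained, which is why the abstract speaks of $\varepsilon$-optimal strategies.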
Published: 1974-10-14
Classification: Dynamic programming, optimal reward, optimal strategy, analytic sets, gambling, 49C99, 60K99, 90C99, 28A05
@article{1176996558,
     author = {Blackwell, D. and Freedman, D. and Orkin, M.},
     title = {The Optimal Reward Operator in Dynamic Programming},
     journal = {Ann. Probab.},
     volume = {2},
     number = {6},
     year = {1974},
     pages = {926-941},
     language = {en},
     url = {http://dml.mathdoc.fr/item/1176996558}
}
Blackwell, D.; Freedman, D.; Orkin, M. The Optimal Reward Operator in Dynamic Programming. Ann. Probab., Volume 2 (1974) no. 6, pp. 926-941. http://gdmltest.u-ga.fr/item/1176996558/