A Modified Form of the Iterative Method of Dynamic Programming
Hordijk, Arie ; Tijms, Henk
Ann. Statist., Volume 3 (1975) no. 1, pp. 203-208 / Harvested from Project Euclid
This paper considers the discrete-time, finite-state Markovian decision problem with the average return criterion. A modified form of the iterative method of dynamic programming is studied. Under the assumption that the maximal average return is independent of the initial state, the asymptotic behaviour of the sequence of functions generated by this modified method is found. It is shown that the modified iterative method supplies upper and lower bounds on the maximal average return as well as $\varepsilon$-optimal policies. Moreover, a convergence result is proved for the policies produced by the modified iterative method.
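For orientation, the kind of bounds mentioned in the abstract can be illustrated with the standard, unmodified iterative method (value iteration) for the average return criterion. The sketch below is not the modified method of the paper, whose details the abstract does not give. It assumes an aperiodic model whose maximal average return does not depend on the initial state. The function name `value_iteration_with_bounds` and the data layout `P[a][i][j]`, `r[a][i]` are hypothetical choices made for this illustration.

```python
def value_iteration_with_bounds(P, r, eps=1e-9, max_iter=10_000):
    """Standard value iteration for the average return criterion.

    P[a][i][j] is the probability of moving from state i to state j under
    action a, and r[a][i] is the one-step return (hypothetical layout).
    Returns (lower, upper, policy), where lower and upper bracket the
    maximal average return under the assumptions stated in the lead-in.
    """
    n_actions, n_states = len(P), len(P[0])
    v = [0.0] * n_states
    lower, upper, policy = float("-inf"), float("inf"), [0] * n_states
    for _ in range(max_iter):
        # One-step lookahead: q[a][i] = r(i, a) + sum_j p(j | i, a) * v(j).
        q = [[r[a][i] + sum(P[a][i][j] * v[j] for j in range(n_states))
              for i in range(n_states)]
             for a in range(n_actions)]
        # Maximising action in each state (first maximiser on ties).
        policy = [max(range(n_actions), key=lambda a: q[a][i])
                  for i in range(n_states)]
        v_new = [q[policy[i]][i] for i in range(n_states)]
        # The smallest and largest increments bracket the maximal average return.
        diffs = [v_new[i] - v[i] for i in range(n_states)]
        lower, upper = min(diffs), max(diffs)
        if upper - lower < eps:
            break
        # Subtract a reference value so the iterates stay bounded;
        # the increments are unaffected by this constant shift.
        base = v_new[0]
        v = [x - base for x in v_new]
    return lower, upper, policy


# Toy model: both actions lead to either state with probability 1/2,
# so the best policy simply maximises the one-step return in each state.
P = [[[0.5, 0.5], [0.5, 0.5]],
     [[0.5, 0.5], [0.5, 0.5]]]
r = [[1.0, 3.0],   # action 0
     [2.0, 0.0]]   # action 1
lower, upper, policy = value_iteration_with_bounds(P, r)
```

In this toy model the bounds meet at 2.5, the average of the best one-step returns 2 and 3, and the returned policy picks action 1 in state 0 and action 0 in state 1. When the iteration is stopped with `upper - lower < eps`, the policy found at that step is $\varepsilon$-optimal in the sense used in the abstract, again under the stated assumptions.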
Published: 1975-01-14
Classification: Markov decision theory, average return, dynamic programming, modified iterative method, convergence results, 90C40
@article{1176343008,
     author = {Hordijk, Arie and Tijms, Henk},
     title = {A Modified Form of the Iterative Method of Dynamic Programming},
     journal = {Ann. Statist.},
     volume = {3},
     number = {1},
     year = {1975},
     pages = {203-208},
     language = {en},
     url = {http://dml.mathdoc.fr/item/1176343008}
}
Hordijk, Arie; Tijms, Henk. A Modified Form of the Iterative Method of Dynamic Programming. Ann. Statist., Vol. 3 (1975) no. 1, pp. 203-208. http://gdmltest.u-ga.fr/item/1176343008/