Markov Decision Processes with a New Optimality Criterion: Continuous Time
Jaquette, Stratton C.
Ann. Statist., Vol. 3 (1975) no. 1, p. 547-553 / Harvested from Project Euclid
Standard finite state and action continuous time Markov decision processes with discounting are studied using a new optimality criterion called moment optimality. A policy is moment optimal if it lexicographically maximizes the sequence of signed moments of total discounted return with a positive (negative) sign if the moment is odd (even). It is shown constructively that a stationary policy is moment optimal among the class of piecewise constant policies by examining the negative of the Laplace transform of the total return random variable and its Taylor series expansion.
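The lexicographic comparison described in the abstract can be illustrated with a minimal sketch. It assumes the sequence is truncated at a finite order and that each policy's total discounted return has a known discrete distribution; both are simplifications for illustration, and the names `signed_moments` and `moment_prefers` are hypothetical, not taken from the paper. The sign pattern matches the Taylor coefficients of the negative Laplace transform, -E[exp(-sR)] = -1 + s E[R] - (s^2/2) E[R^2] + ...

```python
from fractions import Fraction


def signed_moments(outcomes, probs, order):
    """Return [E[R], -E[R^2], E[R^3], ...] up to `order` (truncation is an assumption)."""
    moments = []
    for n in range(1, order + 1):
        m = sum(p * r**n for r, p in zip(outcomes, probs))
        sign = 1 if n % 2 == 1 else -1  # odd moments positive, even moments negative
        moments.append(sign * m)
    return moments


def moment_prefers(a, b):
    """True if signed-moment sequence `a` is lexicographically greater than `b`."""
    return a > b  # Python compares lists lexicographically


# Two hypothetical return distributions with the same mean of 2.
safe = signed_moments([2], [Fraction(1)], 3)                # return is always 2
risky = signed_moments([0, 4], [Fraction(1, 2)] * 2, 3)     # return is 0 or 4

print(safe, risky, moment_prefers(safe, risky))
```

In this toy case the means tie, so the comparison falls through to the second signed moment, and the distribution with the smaller second moment is preferred.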
Published: 1975-03-14
Classification: Dynamic programming, Markov decision processes, optimality criterion, moments of return, 90B99, 60J25, 93E20, 90C40
@article{1176343087,
     author = {Jaquette, Stratton C.},
     title = {Markov Decision Processes with a New Optimality Criterion: Continuous Time},
     journal = {Ann. Statist.},
     volume = {3},
     number = {1},
     year = {1975},
     pages = {547-553},
     language = {en},
     url = {http://dml.mathdoc.fr/item/1176343087}
}
Jaquette, Stratton C. Markov Decision Processes with a New Optimality Criterion: Continuous Time. Ann. Statist., Vol. 3 (1975) no. 1, pp. 547-553. http://gdmltest.u-ga.fr/item/1176343087/