Many problems modeled by Markov decision processes (MDPs) have very large state and/or action spaces, leading to the well-known curse of dimensionality, which makes solving the resulting models intractable. In other cases, the system of interest is complex enough that it is not feasible to explicitly specify some of the MDP model parameters, but simulated sample paths can be
readily generated (e.g., for random state transitions and rewards), albeit at a non-trivial computational cost. For these settings, we have developed various sampling and population-based numerical
algorithms to overcome the computational difficulty of computing an optimal solution in terms of a policy and/or value function. Specific approaches presented in this survey include multi-stage adaptive sampling, evolutionary policy iteration, and evolutionary random policy search.
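To make the problem setting concrete, the following is a minimal Python sketch (not taken from the algorithms surveyed here) of an MDP accessed only through a generative model: the transition and reward structure is hidden behind a step routine that returns sampled next states and rewards, and a policy's value is estimated by averaging simulated sample paths. All names (GenerativeMDP, estimate_value, the toy inventory dynamics) are illustrative assumptions.

```python
import random

class GenerativeMDP:
    """Toy MDP available only through simulation: the transition
    probabilities and rewards are never specified explicitly;
    instead, step() returns a sampled (next_state, reward) pair.
    (Illustrative assumption: a simple inventory-control model.)
    """

    def __init__(self, capacity=10, seed=0):
        self.capacity = capacity          # states 0..capacity (stock level)
        self.rng = random.Random(seed)

    def step(self, state, action):
        """Sample one transition: order `action` units, observe random
        demand; reward = sales revenue - ordering and holding costs."""
        stock = min(state + action, self.capacity)
        demand = self.rng.randint(0, 4)   # random demand, unknown to the solver
        sold = min(stock, demand)
        next_state = stock - sold
        reward = 5.0 * sold - 1.0 * action - 0.5 * next_state
        return next_state, reward

def estimate_value(mdp, policy, state, gamma=0.95, horizon=50, n_paths=200):
    """Monte Carlo estimate of the discounted value of `policy` at
    `state`, averaged over simulated sample paths."""
    total = 0.0
    for _ in range(n_paths):
        s, ret, discount = state, 0.0, 1.0
        for _ in range(horizon):
            a = policy(s)
            s, r = mdp.step(s, a)
            ret += discount * r
            discount *= gamma
        total += ret
    return total / n_paths

if __name__ == "__main__":
    mdp = GenerativeMDP()
    # A simple base-stock policy: order up to 6 units.
    policy = lambda s: max(0, 6 - s)
    print(f"Estimated value at state 0: {estimate_value(mdp, policy, 0):.2f}")
```

Each call to step() incurs simulation cost, which is precisely why the sampling-based methods surveyed here must allocate simulation effort carefully rather than exhaustively enumerating states and actions.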