This work concerns Markov decision processes with finite state space and compact action sets. The decision maker is assumed to have a constant risk-sensitivity coefficient, and a control policy is evaluated via the risk-sensitive expected total-reward criterion associated with nonnegative one-step rewards. Assuming that the optimal value function is finite, the following result is established under mild continuity and compactness restrictions: if the number of ergodic classes induced when a stationary policy drives the system depends continuously on the policy employed, then there exists an optimal stationary policy, extending results obtained by Schäl (1984) for risk-neutral dynamic programming. The proof relies on results recently established for unichain systems, and handles the general multichain case via a reduction to a model with the unichain property.
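For orientation, a common way to formalize the risk-sensitive total-reward criterion with a constant risk-sensitivity coefficient is through an exponential utility; the notation below (V_\lambda, r, X_t, A_t) is ours and may differ in detail from the paper's own definitions:

\[
V_\lambda(x,\pi) \;=\; \frac{1}{\lambda}\,\log E_x^{\pi}\!\left[\exp\!\left(\lambda \sum_{t=0}^{\infty} r(X_t, A_t)\right)\right], \qquad \lambda \neq 0,
\]

where x is the initial state, \pi is the policy in use, r \ge 0 is the one-step reward function, and (X_t, A_t) denotes the state-action process. Under this formulation, a stationary policy f is optimal when V_\lambda(x, f) = \sup_\pi V_\lambda(x, \pi) at every state x.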
@article{bwmeta1.element.bwnjournal-article-doi-10_4064-am28-1-7,
  author   = {Rolando Cavazos-Cadena and Ra\'ul Montes-de-Oca},
  title    = {Stationary optimal policies in a class of multichain positive dynamic programs with finite state space and risk-sensitive criterion},
  journal  = {Applicationes Mathematicae},
  volume   = {28},
  year     = {2001},
  pages    = {93-109},
  zbl      = {1039.93067},
  language = {en},
  url      = {http://dml.mathdoc.fr/item/bwmeta1.element.bwnjournal-article-doi-10_4064-am28-1-7}
}
Rolando Cavazos-Cadena; Raúl Montes-de-Oca. Stationary optimal policies in a class of multichain positive dynamic programs with finite state space and risk-sensitive criterion. Applicationes Mathematicae, Volume 28 (2001), pp. 93-109. http://gdmltest.u-ga.fr/item/bwmeta1.element.bwnjournal-article-doi-10_4064-am28-1-7/