The paper deals with continuous-time Markov decision processes on a fairly general state space. Rewards are discounted continuously at rate $\alpha > 0$. A set of conditions is shown to be necessary and sufficient for a policy to be optimal. For the special case of a time-independent reward function, and under the assumption that the action space is finite, a policy improvement algorithm is proposed and its convergence to an optimal policy is proved.
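As a rough illustration of what a policy improvement scheme looks like in the simplest setting, the sketch below assumes a finite state space (the paper itself treats a far more general one), a finite action set, and time-independent reward rates. It is not the author's construction. The names `Q` (one generator matrix per action), `r` (reward rates per action and state), and the two-state numbers are hypothetical and chosen only for this example. Each pass evaluates the current stationary policy by solving $(\alpha I - Q_d) v = r_d$, then switches to any action that strictly increases $r(x,a) + \sum_y q(y \mid x,a)\, v(y)$.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, n):
            f = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= f * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def evaluate(policy, Q, r, alpha):
    """Discounted return of a stationary policy: solve (alpha*I - Q_d) v = r_d."""
    n = len(policy)
    A = [[(alpha if x == y else 0.0) - Q[policy[x]][x][y] for y in range(n)]
         for x in range(n)]
    b = [r[policy[x]][x] for x in range(n)]
    return solve_linear(A, b)


def policy_iteration(Q, r, alpha, tol=1e-12):
    """Alternate evaluation and improvement until the policy stops changing."""
    n_actions, n = len(Q), len(Q[0])
    policy = [0] * n  # arbitrary starting policy: action 0 everywhere
    while True:
        v = evaluate(policy, Q, r, alpha)

        def gain(a, x):
            # Reward rate plus generator applied to the current value function.
            return r[a][x] + sum(Q[a][x][y] * v[y] for y in range(n))

        new_policy = []
        for x in range(n):
            best = policy[x]  # keep the current action unless strictly beaten
            for a in range(n_actions):
                if gain(a, x) > gain(best, x) + tol:
                    best = a
            new_policy.append(best)
        if new_policy == policy:
            return policy, v
        policy = new_policy


# Hypothetical two-state, two-action example (all numbers are made up).
# Action 1 differs from action 0 only in state 1: it costs 2 per unit time
# but moves back to state 0 at rate 4 instead of rate 1.
Q = [
    [[-1.0, 1.0], [1.0, -1.0]],   # generator under action 0
    [[-1.0, 1.0], [4.0, -4.0]],   # generator under action 1
]
r = [
    [5.0, 0.0],    # reward rates under action 0
    [5.0, -2.0],   # reward rates under action 1
]
alpha = 0.5

policy, v = policy_iteration(Q, r, alpha)
print(policy, v)
```

Keeping the current action on ties is a deliberate choice in this sketch: it prevents the loop from cycling between policies with equal value, so with finitely many stationary policies the iteration stops after finitely many strict improvements.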
Published: 1976-11-14
Classification:
Continuous time Markov decision processes,
optimal discounted return function,
optimal policy,
49C15,
60K99
@article{1176343653,
author = {Doshi, Bharat T.},
title = {Continuous Time Control of Markov Processes on an Arbitrary State Space: Discounted Rewards},
journal = {Ann. Statist.},
volume = {4},
number = {1},
year = {1976},
pages = {1219--1235},
language = {en},
url = {http://dml.mathdoc.fr/item/1176343653}
}
Doshi, Bharat T. Continuous Time Control of Markov Processes on an Arbitrary State Space: Discounted Rewards. Ann. Statist., Vol. 4 (1976) no. 1, pp. 1219-1235. http://gdmltest.u-ga.fr/item/1176343653/