This paper considers discrete-time Markov control processes on Borel spaces, with possibly unbounded costs, under the long-run average cost (AC) criterion. Under appropriate hypotheses on weighted norms for the cost function and the transition law, the existence of solutions to the average cost optimality inequality and to the average cost optimality equation is shown; these in turn yield the existence of AC-optimal and AC-canonical policies, respectively.
@article{bwmeta1.element.bwnjournal-article-zmv23i2p199bwm,
  author   = {Evgueni Gordienko and On\'esimo Hern\'andez-Lerma},
  title    = {Average cost Markov control processes with weighted norms: existence of canonical policies},
  journal  = {Applicationes Mathematicae},
  volume   = {23},
  year     = {1995},
  pages    = {199--218},
  zbl      = {0829.93067},
  language = {en},
  url      = {http://dml.mathdoc.fr/item/bwmeta1.element.bwnjournal-article-zmv23i2p199bwm}
}
Gordienko, Evgueni; Hernández-Lerma, Onésimo. Average cost Markov control processes with weighted norms: existence of canonical policies. Applicationes Mathematicae, Vol. 23 (1995), pp. 199-218. http://gdmltest.u-ga.fr/item/bwmeta1.element.bwnjournal-article-zmv23i2p199bwm/
[1] A. Arapostathis, V. S. Borkar, E. Fernández-Gaucherand, M. K. Ghosh and S. I. Marcus, Discrete-time controlled Markov processes with average cost criterion: a survey, SIAM J. Control Optim. 31 (1993), 282-344. | Zbl 0770.93064
[2] D. P. Bertsekas and S. E. Shreve, Stochastic Optimal Control: The Discrete Time Case, Academic Press, New York, 1978.
[3] E. B. Dynkin and A. A. Yushkevich, Controlled Markov Processes, Springer, New York, 1979. | Zbl 0073.34801
[4] E. I. Gordienko, Controlled Markov processes with slowly varying characteristics. The problem of adaptive control. I, Soviet J. Comput. Syst. Sci. 23 (1985), 87-95. | Zbl 0595.93069
[5] E. I. Gordienko and O. Hernández-Lerma, Average cost Markov control processes with weighted norms: value iteration, this volume, 219-237. | Zbl 0829.93068
[6] O. Hernández-Lerma, Adaptive Markov Control Processes, Springer, New York, 1989.
[7] O. Hernández-Lerma, Average optimality in dynamic programming on Borel spaces - unbounded costs and controls, Systems Control Lett. 17 (1991), 237-242. | Zbl 0771.90098
[8] O. Hernández-Lerma and J. B. Lasserre, Average cost optimal policies for Markov control processes with Borel state space and unbounded costs, ibid. 15 (1990), 349-356. | Zbl 0723.93080
[9] O. Hernández-Lerma and J. B. Lasserre, Discrete-Time Markov Control Processes, book in preparation. | Zbl 0724.93087
[10] O. Hernández-Lerma, R. Montes-de-Oca and R. Cavazos-Cadena, Recurrence conditions for Markov decision processes with Borel state space: a survey, Ann. Oper. Res. 28 (1991), 29-46.
[11] K. Hinderer, Foundations of Non-Stationary Dynamic Programming with Discrete Time Parameter, Lecture Notes Oper. Res. 33, Springer, New York, 1970. | Zbl 0202.18401
[12] N. V. Kartashov, Inequalities in theorems of ergodicity and stability of Markov chains with common phase space. I, Theory Probab. Appl. 30 (1985), 247-259. | Zbl 0657.60088
[13] N. V. Kartashov, Inequalities in theorems of ergodicity and stability of Markov chains with common phase space. II, ibid. 30 (1985), 507-515. | Zbl 0619.60066
[14] N. V. Kartashov, Strongly stable Markov chains, J. Soviet Math. 34 (1986), 1493-1498. | Zbl 0594.60069
[15] V. K. Malinovskiĭ, Limit theorems for Harris Markov chains, I, Theory Probab. Appl. 31 (1986), 269-285.
[16] R. Montes-de-Oca and O. Hernández-Lerma, Conditions for average optimality in Markov control processes with unbounded costs and controls, J. Math. Systems Estim. Control 4 (1994), 1-19. | Zbl 0812.93077
[17] R. Montes-de-Oca and O. Hernández-Lerma, Value iteration in average cost Markov control processes on Borel spaces, Acta Appl. Math., to appear. | Zbl 0843.93093
[18] E. Nummelin, General Irreducible Markov Chains and Non-Negative Operators, Cambridge University Press, Cambridge, 1984. | Zbl 0551.60066
[19] E. Nummelin and P. Tuominen, Geometric ergodicity of Harris recurrent Markov chains with applications to renewal theory, Stochastic Process. Appl. 12 (1982), 187-202. | Zbl 0484.60056
[20] S. Orey, Limit Theorems for Markov Chain Transition Probabilities, Van Nostrand Reinhold, London, 1971. | Zbl 0295.60054
[21] U. Rieder, Measurable selection theorems for optimization problems, Manuscripta Math. 24 (1978), 115-131. | Zbl 0385.28005
[22] H. L. Royden, Real Analysis, 2nd ed., Macmillan, New York, 1971. | Zbl 0197.03501
[23] M. Schäl, Conditions for optimality and for the limit of n-stage optimal policies to be optimal, Z. Wahrsch. Verw. Gebiete 32 (1975), 179-196. | Zbl 0316.90080
[24] M. Schäl, Average optimality in dynamic programming with general state space, Math. Oper. Res. 18 (1993), 163-172. | Zbl 0777.90079
[25] R. Sznajder and J. A. Filar, Some comments on a theorem of Hardy and Littlewood, J. Optim. Theory Appl. 75 (1992), 201-209.