Nonparametric adaptive control for discrete-time Markov processes with unbounded costs under average criterion

Minjárez-Sosa, J.

Applicationes Mathematicae, Volume 26 (1999), pp. 267-280 / Harvested from The Polish Digital Mathematics Library

Abstract

We introduce average cost optimal adaptive policies in a class of discrete-time Markov control processes with Borel state and action spaces, allowing unbounded costs. The processes evolve according to the system equations $x_{t+1} = F(x_{t}, a_{t}, \xi_{t})$, $t = 1, 2, \ldots$, with i.i.d. $\mathbb{R}^{k}$-valued random vectors $\xi_{t}$, which are observable but whose density $\varrho$ is unknown.
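The setting in the abstract (observable disturbances with an unknown density) can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not the estimator or policy from the paper: the dynamics `F`, the constant placeholder action, the exponential disturbance law, the scalar case k = 1, and the Gaussian kernel estimator with a fixed bandwidth are all hypothetical choices made for the example.

```python
import math
import random


def F(x, a, xi):
    # Hypothetical dynamics (not from the paper): a storage-type system
    # where the action a adds stock and the disturbance xi removes it.
    return max(x + a - xi, 0.0)


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of the observed samples.

    A Gaussian kernel estimator is assumed here purely for illustration.
    """
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def rho_hat(s):
        return norm * sum(
            math.exp(-0.5 * ((s - z) / bandwidth) ** 2) for z in samples
        )

    return rho_hat


def simulate(T, seed=0):
    """Run the system for T steps and record the observed disturbances."""
    rng = random.Random(seed)
    x = 1.0
    observed = []
    for _ in range(T):
        a = 1.0  # placeholder action; an adaptive policy would use the current estimate
        xi = rng.expovariate(1.0)  # assumed true law, unknown to the controller
        observed.append(xi)
        x = F(x, a, xi)
    return x, observed


x_final, xs = simulate(2000)
rho_hat = gaussian_kde(xs, bandwidth=0.2)
print(f"estimate at 1.0: {rho_hat(1.0):.3f}, true value: {math.exp(-1.0):.3f}")
```

In an adaptive scheme the estimate would be refreshed as new disturbances are observed and fed into the choice of action; here the action is held constant so that only the estimation step is shown.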

Published: 1999-01-01

Zbl 1050.93524

EUDML-ID : urn:eudml:doc:219238

@article{bwmeta1.element.bwnjournal-article-zmv26i3p267bwm,
     author = {J. Minj\'arez-Sosa},
     title = {Nonparametric adaptive control for discrete-time Markov processes with unbounded costs under average criterion},
     journal = {Applicationes Mathematicae},
     volume = {26},
     year = {1999},
     pages = {267-280},
     zbl = {1050.93524},
     language = {en},
     url = {http://dml.mathdoc.fr/item/bwmeta1.element.bwnjournal-article-zmv26i3p267bwm}
}

Minjárez-Sosa, J. Nonparametric adaptive control for discrete-time Markov processes with unbounded costs under average criterion. Applicationes Mathematicae, Vol. 26 (1999), pp. 267-280. http://gdmltest.u-ga.fr/item/bwmeta1.element.bwnjournal-article-zmv26i3p267bwm/

References

[1] D. Blackwell, Discrete dynamic programming, Ann. Math. Statist. 33 (1962), 719-726. | Zbl 0133.12906

[2] E. B. Dynkin and A. A. Yushkevich, Controlled Markov Processes, Springer, New York, 1979. | Zbl 0073.34801

[3] E. I. Gordienko, Adaptive strategies for certain classes of controlled Markov processes, Theory Probab. Appl. 29 (1985), 504-518. | Zbl 0577.93067

[4] E. I. Gordienko and O. Hernández-Lerma, Average cost Markov control processes with weighted norms: existence of canonical policies, Appl. Math. (Warsaw) 23 (1995), 199-218. | Zbl 0829.93067

[5] E. I. Gordienko and J. A. Minjárez-Sosa, Adaptive control for discrete-time Markov processes with unbounded costs: discounted criterion, Kybernetika 34 (1998), no. 2, 217-234. | Zbl 1274.90474

[6] E. I. Gordienko and J. A. Minjárez-Sosa, Adaptive control for discrete-time Markov processes with unbounded costs: average criterion, Math. Methods Oper. Res. 48 (1998), 37-55. | Zbl 0952.90043

[7] R. Hasminskii and I. Ibragimov, On density estimation in the view of Kolmogorov's ideas in approximation theory, Ann. Statist. 18 (1990), 999-1010. | Zbl 0705.62039

[8] O. Hernández-Lerma, Adaptive Markov Control Processes, Springer, New York, 1989.

[9] O. Hernández-Lerma, Infinite-horizon Markov control processes with undiscounted cost criteria: from average to overtaking optimality, Reporte Interno 165, Departamento de Matemáticas, CINVESTAV-IPN, México, 1994. | Zbl 0906.93062

[10] O. Hernández-Lerma and R. Cavazos-Cadena, Density estimation and adaptive control of Markov processes: average and discounted criteria, Acta Appl. Math. 20 (1990), 285-307. | Zbl 0717.93066

[11] S. A. Lippman, On dynamic programming with unbounded rewards, Manag. Sci. 21 (1975), 1225-1233. | Zbl 0309.90017

[12] P. Mandl, Estimation and control in Markov chains, Adv. Appl. Probab. 6 (1974), 40-60. | Zbl 0281.60070

[13] U. Rieder, Measurable selection theorems for optimization problems, Manuscripta Math. 24 (1978), 115-131. | Zbl 0385.28005

[14] J. A. E. E. Van Nunen and J. Wessels, A note on dynamic programming with unbounded rewards, Manag. Sci. 24 (1978), 576-580. | Zbl 0374.49015