On adaptive control of Markov chains using nonparametric estimation

Drabik, Ewa; Stettner, Łukasz

Applicationes Mathematicae, Volume 27 (2000), pp. 143-152 / Harvested from The Polish Digital Mathematics Library

Abstract

Two adaptive procedures for controlled Markov chains, both based on nonparametric window estimation, are presented.

Published: 2000-01-01

Zbl 1006.93069

EUDML-ID: urn:eudml:doc:219263

@article{bwmeta1.element.bwnjournal-article-zmv27i2p143bwm,
     author = {Ewa Drabik and {\L}ukasz Stettner},
     title = {On adaptive control of Markov chains using nonparametric estimation},
     journal = {Applicationes Mathematicae},
     volume = {27},
     year = {2000},
     pages = {143-152},
     zbl = {1006.93069},
     language = {en},
     url = {http://dml.mathdoc.fr/item/bwmeta1.element.bwnjournal-article-zmv27i2p143bwm}
}

Drabik, Ewa; Stettner, Łukasz. On adaptive control of Markov chains using nonparametric estimation. Applicationes Mathematicae, Volume 27 (2000), pp. 143-152. http://gdmltest.u-ga.fr/item/bwmeta1.element.bwnjournal-article-zmv27i2p143bwm/

Bibliography

[1] R. Agrawal, The continuum-armed bandit problem, SIAM J. Control Optim. 33 (1995), 1926-1951. | Zbl 0848.93069

[2] V. S. Borkar, Recursive self-tuning of finite Markov chains, Appl. Math. (Warsaw) 24 (1996), 169-188. | Zbl 0951.93537

[3] E. Drabik, On nearly selfoptimizing strategies for multiarmed bandit problems with controlled arms, Appl. Math. (Warsaw) 23 (1996), 449-473. | Zbl 0848.93068

[4] T. Duncan, B. Pasik-Duncan and Ł. Stettner, Discretized maximum likelihood and almost optimal adaptive control of ergodic Markov models, SIAM J. Control Optim. 36 (1998), 422-446. | Zbl 0914.93076

[5] T. Duncan, B. Pasik-Duncan and Ł. Stettner, Adaptive control of discrete Markov processes by the method of large deviations, in: Proc. 35th IEEE CDC, Kobe 1996, IEEE, 360-365. | Zbl 1006.93071

[6] O. Hernández-Lerma and R. Cavazos-Cadena, Density estimation and adaptive control of Markov processes; average and discounted criteria, Acta Appl. Math. 20 (1990), 285-307. | Zbl 0717.93066

[7] A. Nowak, A generalization of Ueno's inequality for n-step transition probabilities, Appl. Math. (Warsaw) 25 (1998), 295-299. | Zbl 0998.60068