Adaptive control of discrete time Markov processes by the large deviations method

Duncan, T.; Pasik-Duncan, B.; Stettner, Łukasz

Applicationes Mathematicae, Volume 27 (2000), pp. 265-285 / Harvested from The Polish Digital Mathematics Library

Abstract

Some discrete time controlled Markov processes in a locally compact metric space whose transition operators depend on an unknown parameter are described. The adaptive controls are constructed using the large deviations of empirical distributions which are uniform in the parameter that takes values in a compact set. The adaptive procedure uses a finite family of continuous, almost optimal controls. Using the large deviations property it is shown that an adaptive control which is a fixed almost optimal control after a finite time is almost optimal with probability nearly 1.
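The scheme summarized in the abstract (estimate the parameter from an empirical distribution, then fix one control from a finite family) can be illustrated with a toy simulation. The following is a minimal sketch and not the authors' construction: the two-state model, the transition law `p_one`, the parameter grid `THETAS`, the exploration control `explore`, and the placeholder family `CONTROLS` are all hypothetical names and values chosen only for illustration.

```python
import random

THETAS = [0.2, 0.5, 0.8]  # hypothetical finite grid of candidate parameters


def p_one(x, a, theta):
    """Toy transition law (an assumption): probability that the next state is 1."""
    return 0.1 + 0.5 * theta + 0.2 * x + 0.1 * a


def invariant_mass_of_one(theta, policy):
    """Invariant probability of state 1 for the two-state chain under `policy`."""
    p01 = p_one(0, policy(0), theta)
    p11 = p_one(1, policy(1), theta)
    return p01 / (p01 + 1.0 - p11)


def explore(x):
    """Fixed control used during the estimation phase (always action 0)."""
    return 0


# Placeholder family: one control per candidate parameter, standing in for
# a finite family of almost optimal controls (not computed here).
CONTROLS = {
    0.2: lambda x: 0,
    0.5: lambda x: x,
    0.8: lambda x: 1,
}


def adaptive_control(true_theta, n_steps, rng):
    """Estimate theta from the empirical distribution, then fix a control."""
    x, visits_to_one = 0, 0
    for _ in range(n_steps):
        x = 1 if rng.random() < p_one(x, explore(x), true_theta) else 0
        visits_to_one += x
    empirical = visits_to_one / n_steps
    # Pick the candidate whose invariant measure is closest to the empirical one.
    theta_hat = min(
        THETAS,
        key=lambda t: abs(invariant_mass_of_one(t, explore) - empirical),
    )
    return theta_hat, CONTROLS[theta_hat]


theta_hat, control = adaptive_control(
    true_theta=0.5, n_steps=5000, rng=random.Random(0)
)
print(theta_hat)
```

In this toy model the invariant measures of the three candidates are well separated, so a wrong selection requires a large deviation of the empirical frequency, an event whose probability decays rapidly as `n_steps` grows. That is the role the abstract assigns to large deviations estimates that are uniform in the parameter.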

Published: 2000-01-01

Zbl 1006.93071

EUDML-ID : urn:eudml:doc:219273

@article{bwmeta1.element.bwnjournal-article-zmv27i3p265bwm,
     author = {T. Duncan and B. Pasik-Duncan and \L ukasz Stettner},
     title = {Adaptive control of discrete time Markov processes by the large deviations method},
     journal = {Applicationes Mathematicae},
     volume = {27},
     year = {2000},
     pages = {265-285},
     zbl = {1006.93071},
     language = {en},
     url = {http://dml.mathdoc.fr/item/bwmeta1.element.bwnjournal-article-zmv27i3p265bwm}
}

Duncan, T.; Pasik-Duncan, B.; Stettner, Łukasz. Adaptive control of discrete time Markov processes by the large deviations method. Applicationes Mathematicae, Volume 27 (2000), pp. 265-285. http://gdmltest.u-ga.fr/item/bwmeta1.element.bwnjournal-article-zmv27i3p265bwm/

References

[1] G. B. Di Masi and Ł. Stettner, Bayesian ergodic adaptive control of discrete time Markov processes, Stochastics Stochastics Rep. 54 (1995), 301-316. | Zbl 0855.93103

[2] M. D. Donsker and S. R. S. Varadhan, Asymptotic evaluation of certain Markov process expectations for large time, I, Comm. Pure Appl. Math. 28 (1975), 1-47. | Zbl 0323.60069

[3] M. D. Donsker and S. R. S. Varadhan, Asymptotic evaluation of certain Markov process expectations for large time, III, ibid. 29 (1976), 389-461. | Zbl 0348.60032

[4] M. Duflo, Formule de Chernoff pour des chaînes de Markov (d'après Donsker et Varadhan), in: Grandes déviations et applications statistiques, Séminaire Orsay 1977-78, Astérisque 68 (1979), 99-124.

[5] T. E. Duncan, B. Pasik-Duncan and Ł. Stettner, Discretized maximum likelihood and almost optimal control of ergodic Markov models, SIAM J. Control Optim. 36 (1998), 422-446. | Zbl 0914.93076

[6] O. Hernández-Lerma, Adaptive Markov Control Processes, Springer, 1989.

[7] N. Maigret, Majorations de Chernoff pour des chaînes de Markov contrôlées, Z. Wahrsch. Verw. Gebiete 51 (1980), 133-151. | Zbl 0397.60051

[8] N. Maigret, Statistiques des chaînes contrôlées Felleriennes, in: Grandes déviations et applications statistiques, Séminaire Orsay 1977-78, Astérisque 68 (1979), 143-169.

[9] Ł. Stettner, On nearly self-optimizing strategies for a discrete-time uniformly ergodic adaptive model, Appl. Math. Optim. 27 (1993), 161-177. | Zbl 0769.93084