We study estimation in the class of stationary variable length
Markov chains (VLMC) on a finite space. The processes in this class are still
Markovian of high order, but with memory of variable length yielding a much
bigger and structurally richer class of models than ordinary high-order Markov
chains. From an algorithmic view, the VLMC model class has attracted interest
in information theory and machine learning, but statistical properties have not
yet been explored. Provided that good estimation is available, the additional
structural richness of the model class enhances predictive power by finding a
better tradeoff between model bias and variance and by allowing a better
structural description, which can be of specific interest. The latter is
exemplified with some DNA data.
¶ A version of the tree-structured context algorithm, proposed by
Rissanen in an information-theoretic setup, is shown to have new good
asymptotic properties for estimation in the class of VLMCs. This remains true
even when the underlying model increases in dimensionality. Furthermore,
consistent estimation of minimal state spaces and mixing properties of fitted
models are given.
¶ We also propose a new bootstrap scheme based on fitted VLMCs. We
show its validity for quite general stationary categorical time series and for
a broad range of statistical procedures.
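To illustrate the variable-length memory that distinguishes a VLMC from an ordinary high-order Markov chain, the following toy sketch (a hypothetical example; the tree and probabilities are invented for illustration and are not from the paper) represents a context tree as a map from contexts, i.e. short suffixes of the past, to next-symbol distributions, and predicts using the longest context matching the observed history:

```python
# Toy context tree over the alphabet {0, 1} (hypothetical values):
# after a 1 the next symbol depends only on that single 1, but after
# a 0 it depends on two past symbols -- the memory length varies.
CONTEXT_TREE = {
    "1":  {0: 0.3, 1: 0.7},
    "00": {0: 0.9, 1: 0.1},
    "10": {0: 0.4, 1: 0.6},
}

def matching_context(history):
    """Return the longest context that is a suffix of `history`."""
    best = None
    for ctx in CONTEXT_TREE:
        if history.endswith(ctx) and (best is None or len(ctx) > len(best)):
            best = ctx
    return best

def next_symbol_distribution(history):
    """Look up the next-symbol distribution for the matching context."""
    ctx = matching_context(history)
    if ctx is None:
        raise ValueError("history does not end in any context")
    return CONTEXT_TREE[ctx]

print(matching_context("0111"))  # "1"  (one symbol of memory suffices)
print(matching_context("0100"))  # "00" (two symbols of memory needed)
print(matching_context("0110"))  # "10"
```

Because contexts of different lengths coexist in one tree, the state space can be much smaller than that of a full Markov chain of the maximal order, which is the bias-variance tradeoff the abstract refers to.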
Published: 1999-04-14
Classification: Bootstrap, categorical time series, central limit theorem, context algorithm, data compression, finite-memory sources, FSMX model, Kullback-Leibler distance, model selection, tree model.
MSC: 62M05, 60J10, 62G09, 62M10, 94A15
@article{1018031204,
author = {B\"uhlmann, Peter and Wyner, Abraham J.},
title = {Variable length Markov chains},
journal = {Ann. Statist.},
volume = {27},
number = {4},
year = {1999},
pages = {480--513},
language = {en},
url = {http://dml.mathdoc.fr/item/1018031204}
}
Bühlmann, Peter; Wyner, Abraham J. Variable length Markov chains. Ann. Statist., Tome 27 (1999) no. 4, pp. 480-513. http://gdmltest.u-ga.fr/item/1018031204/