Evaluation and selection of models for out-of-sample prediction when the sample size is small relative to the complexity of the data-generating process

Leeb, Hannes

Bernoulli, Volume 14 (2008) no. 1, pp. 661-690 / Harvested from Project Euclid

Abstract

In regression with random design, we study the problem of selecting a model that performs well for out-of-sample prediction. We do not assume that any of the candidate models under consideration are correct. Our analysis is based on explicit finite-sample results. Our main findings differ from those of other analyses that are based on traditional large-sample limit approximations because we consider a situation where the sample size is small relative to the complexity of the data-generating process, in the sense that the number of parameters in a ‘good’ model is of the same order as sample size. Also, we allow for the case where the number of candidate models is (much) larger than sample size.

Published: 2008-08-15
Classification: generalized cross validation, large number of parameters and small sample size, model selection, nonparametric regression, out-of-sample prediction, S_p criterion

@article{1219669625,
     author = {Leeb, Hannes},
     title = {Evaluation and selection of models for out-of-sample prediction when the sample size is small relative to the complexity of the data-generating process},
     journal = {Bernoulli},
     volume = {14},
     number = {1},
     year = {2008},
     pages = {661-690},
     language = {en},
     url = {http://dml.mathdoc.fr/item/1219669625}
}

Leeb, Hannes. Evaluation and selection of models for out-of-sample prediction when the sample size is small relative to the complexity of the data-generating process. Bernoulli, Volume 14 (2008) no. 1, pp. 661-690. http://gdmltest.u-ga.fr/item/1219669625/