Evaluation and selection of models for out-of-sample prediction when the sample size is small relative to the complexity of the data-generating process
In regression with random design, we study the problem of selecting a model that performs well for out-of-sample prediction. We do not assume that any of the candidate models under consideration are correct. Our analysis is based on explicit finite-sample results. Our main findings differ from those of other analyses that are based on traditional large-sample limit approximations because we consider a situation where the sample size is small relative to the complexity of the data-generating process, in the sense that the number of parameters in a ‘good’ model is of the same order as the sample size. We also allow for the case where the number of candidate models is (much) larger than the sample size.
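The setting described in the abstract can be illustrated with a small simulation. The following is a minimal sketch and not the paper's procedure: it scores nested candidate linear submodels by generalized cross-validation, using the standard form GCV = (RSS/n) / (1 - m/n)^2 for a least-squares fit with m parameters. The function names, the simulated Gaussian design, the nested candidate family, and the sizes n = 40 and p = 30 are all assumptions made for illustration.

```python
import numpy as np


def gcv_score(X, y):
    """Generalized cross-validation score of a least-squares fit of y on X.

    Uses GCV = (RSS / n) / (1 - m / n)^2, where m is the number of columns
    of X. No intercept is added. Returns inf when m >= n.
    """
    n, m = X.shape
    if m >= n:
        return np.inf
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return (rss / n) / (1.0 - m / n) ** 2


def select_by_gcv(X, y, candidates):
    """Return the candidate column subset with the smallest GCV score."""
    scores = [gcv_score(X[:, list(cols)], y) for cols in candidates]
    best = int(np.argmin(scores))
    return candidates[best], scores


# Hypothetical example: random design with many slowly decaying
# coefficients, so that no small candidate model is exactly correct.
rng = np.random.default_rng(0)
n, p = 40, 30
X = rng.standard_normal((n, p))
beta_true = 1.0 / np.arange(1, p + 1)
y = X @ beta_true + rng.standard_normal(n)

# Nested candidates: the first k regressors, for k = 1, ..., p.
candidates = [tuple(range(k)) for k in range(1, p + 1)]
best_cols, scores = select_by_gcv(X, y, candidates)
print("selected model size:", len(best_cols))
```

Here the number of parameters in the larger candidates is of the same order as n, which is the regime the abstract refers to; the sketch only shows how such a criterion is computed and makes no claim about its performance.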
Published: 2008-08-15
Classification:
generalized cross validation,
large number of parameters and small sample size,
model selection,
nonparametric regression,
out-of-sample prediction,
S_p criterion
@article{1219669625,
author = {Leeb, Hannes},
title = {Evaluation and selection of models for out-of-sample prediction when the sample size is small relative to the complexity of the data-generating process},
journal = {Bernoulli},
volume = {14},
number = {1},
year = {2008},
pages = {661--690},
language = {en},
url = {http://dml.mathdoc.fr/item/1219669625}
}
Leeb, Hannes. Evaluation and selection of models for out-of-sample prediction when the sample size is small relative to the complexity of the data-generating process. Bernoulli, Vol. 14 (2008) no. 1, pp. 661-690. http://gdmltest.u-ga.fr/item/1219669625/