We prove new probabilistic upper bounds on the generalization error of
complex classifiers that are combinations of simple classifiers. Such
combinations could be implemented by neural networks or by voting methods
that combine the classifiers, such as boosting and bagging. The bounds are
stated in terms of the empirical distribution of the margin of the combined classifier.
They are based on methods from the theory of Gaussian and empirical processes
(comparison inequalities, the symmetrization method, concentration inequalities),
and they improve previous results of Bartlett (1998) on bounding the
generalization error of neural networks in terms of the $\ell_1$-norms of the
weights of the neurons, and of Schapire, Freund, Bartlett and Lee (1998) on bounding
the generalization error of boosting. We also obtain rates of convergence in
the Lévy distance of the empirical margin distribution to the true margin
distribution, uniformly over the classes of classifiers, and prove the optimality
of these rates.
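
For concreteness, the central quantities can be sketched as follows; the notation is illustrative of the standard margin-bound setting and is not copied verbatim from the paper. For a combined classifier $f(x)=\sum_{t}\alpha_t h_t(x)$ (for instance a voting combination with $\sum_t|\alpha_t|\le 1$) and labels $y\in\{-1,1\}$, the margin of an example $(x,y)$ is $y f(x)$, and the empirical margin distribution based on a sample $(x_1,y_1),\dots,(x_n,y_n)$ is
$$
F_n(\delta)=\frac{1}{n}\sum_{i=1}^{n}\mathbf{1}\{y_i f(x_i)\le \delta\},\qquad \delta\in\mathbb{R}.
$$
The Lévy distance, in which the convergence rates of $F_n$ to the true margin distribution are measured, is the usual
$$
L(F,G)=\inf\bigl\{\varepsilon>0:\ F(t-\varepsilon)-\varepsilon\le G(t)\le F(t+\varepsilon)+\varepsilon\ \text{for all }t\in\mathbb{R}\bigr\}.
$$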
@article{1015362183,
  author   = {Koltchinskii, V. and Panchenko, D.},
  title    = {Empirical Margin Distributions and Bounding the Generalization Error of Combined Classifiers},
  journal  = {Ann. Statist.},
  volume   = {30},
  number   = {1},
  year     = {2002},
  pages    = {1--50},
  language = {en},
  url      = {http://dml.mathdoc.fr/item/1015362183}
}
Koltchinskii, V.; Panchenko, D. Empirical Margin Distributions and Bounding the Generalization Error of Combined Classifiers. Ann. Statist. 30 (2002), no. 1, pp. 1-50. http://gdmltest.u-ga.fr/item/1015362183/