Bounding the generalization error of convex combinations of classifiers: balancing the dimensionality and the margins

Koltchinskii, Vladimir; Panchenko, Dmitriy; Lozano, Fernando

Ann. Appl. Probab., Volume 13 (2003) no. 1, pp. 213-252 / Harvested from Project Euclid

Abstract

A problem of bounding the generalization error of a classifier $f\in \operatorname{conv}(\mathcal{H})$, where $\mathcal{H}$ is a "base" class of functions (classifiers), is considered. This problem frequently occurs in computer learning, where efficient algorithms that combine simple classifiers into a complex one (such as boosting and bagging) have attracted a lot of attention. Using Talagrand's concentration inequalities for empirical processes, we obtain new sharper bounds on the generalization error of combined classifiers that take into account both the empirical distribution of "classification margins" and an "approximate dimension" of the classifiers, and study the performance of these bounds in several experiments with learning algorithms.

Published: 2003-01-14
Classification: Generalization error, combined classifier, margin, approximate dimension, empirical process, Rademacher process, random entropies, concentration inequalities, boosting, bagging, 62G05, 62G20, 60F15

@article{1042765667,
     author = {Koltchinskii, Vladimir and Panchenko, Dmitriy and Lozano, Fernando},
     title = {Bounding the generalization error of convex combinations of classifiers: balancing the dimensionality and the margins},
     journal = {Ann. Appl. Probab.},
     volume = {13},
     number = {1},
     year = {2003},
     pages = {213-252},
     language = {en},
     url = {http://dml.mathdoc.fr/item/1042765667}
}

Koltchinskii, Vladimir; Panchenko, Dmitriy; Lozano, Fernando. Bounding the generalization error of convex combinations of classifiers: balancing the dimensionality and the margins. Ann. Appl. Probab., Volume 13 (2003) no. 1, pp. 213-252. http://gdmltest.u-ga.fr/item/1042765667/