Discriminant analysis for two data sets in $\mathbb{R}^d$ with
probability densities $f$ and $g$ can be based on the estimation of the set $G
= \{x: f(x) \geq g(x)\}$. We consider applications where it is appropriate to
assume that the region $G$ has a smooth boundary or belongs to another
nonparametric class of sets. In particular, this assumption makes sense if
discrimination is used as a data-analytic tool. Decision rules based on
minimization of empirical risk over the whole class of sets and over sieves are
considered. Their rates of convergence are obtained. We show that these rules
achieve optimal rates for estimation of $G$ and optimal rates of convergence
for Bayes risks. An interesting conclusion is that the optimal rates for Bayes
risks can be very fast, in particular, faster than the
“parametric” root-$n$ rate. These fast rates cannot be guaranteed
for plug-in rules.
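
To make the target set concrete, note a standard fact (stated here under the simplifying assumption of equal prior class probabilities, which the abstract does not fix): $G$ is exactly the acceptance region of the Bayes rule. For a candidate set $C$, classify to the first population on $C$; the misclassification risk and its minimizer are
\[
R(C) = \tfrac{1}{2}\int_{C^c} f(x)\,dx + \tfrac{1}{2}\int_{C} g(x)\,dx
     = \tfrac{1}{2} - \tfrac{1}{2}\int_{C} \bigl(f(x)-g(x)\bigr)\,dx,
\qquad
\operatorname*{arg\,min}_{C} R(C) = \{x : f(x) \geq g(x)\} = G.
\]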
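The following is a minimal numerical sketch of the empirical-risk-minimization idea, in Python. The class of Euclidean balls, the random-search minimizer, the Gaussian toy data, and the kernel plug-in comparison are all illustrative assumptions of this sketch, not the smooth-boundary classes, sieves, or estimators analyzed in the paper.

import numpy as np
from scipy.stats import gaussian_kde

def empirical_risk(in_C_f, in_C_g):
    # Empirical analogue of R(C): half the f-sample mass outside C
    # plus half the g-sample mass inside C.
    return 0.5 * np.mean(~in_C_f) + 0.5 * np.mean(in_C_g)

def erm_ball(X_f, X_g, n_candidates=2000, seed=0):
    # Minimize empirical risk over Euclidean balls {x: ||x - c|| <= r}
    # by random search (an illustrative class and minimizer).
    rng = np.random.default_rng(seed)
    Z = np.vstack([X_f, X_g])
    lo, hi = Z.min(axis=0), Z.max(axis=0)
    r_max = np.linalg.norm(hi - lo)
    best_risk, best_ball = np.inf, None
    for _ in range(n_candidates):
        c = rng.uniform(lo, hi)
        r = rng.uniform(0.0, r_max)
        risk = empirical_risk(np.linalg.norm(X_f - c, axis=1) <= r,
                              np.linalg.norm(X_g - c, axis=1) <= r)
        if risk < best_risk:
            best_risk, best_ball = risk, (c, r)
    return best_risk, best_ball

# Hypothetical toy data: two Gaussian samples in R^2.
rng = np.random.default_rng(1)
X_f = rng.normal(0.0, 1.0, size=(500, 2))
X_g = rng.normal(2.0, 1.0, size=(500, 2))
erm_risk, (c, r) = erm_ball(X_f, X_g)

# Plug-in alternative: estimate f and g by kernel density estimates and
# set \hat G = {x: \hat f(x) >= \hat g(x)}; the paper shows the fast
# rates cannot be guaranteed for rules of this type.
kde_f, kde_g = gaussian_kde(X_f.T), gaussian_kde(X_g.T)
plug_in_risk = empirical_risk(kde_f(X_f.T) >= kde_g(X_f.T),
                              kde_f(X_g.T) >= kde_g(X_g.T))
print(f"ERM over balls: {erm_risk:.3f}; plug-in KDE rule: {plug_in_risk:.3f}")

Both risks here are evaluated in-sample on the training data, which suffices to contrast the two constructions; the paper's rate results concern the true excess Bayes risk of such rules.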
@article{1017939240,
author = {Mammen, Enno and Tsybakov, Alexandre B.},
title = {Smooth discrimination analysis},
journal = {Ann. Statist.},
volume = {27},
number = {4},
year = {1999},
pages = {1808--1829},
language = {en},
url = {http://dml.mathdoc.fr/item/1017939240}
}
Mammen, Enno; Tsybakov, Alexandre B. Smooth discrimination analysis. Ann. Statist., Volume 27 (1999), no. 4, pp. 1808-1829. http://gdmltest.u-ga.fr/item/1017939240/