The aim of this paper is to build an estimate of an unknown density as a linear combination of functions from a dictionary. Inspired by Candès and Tao's approach, we propose minimizing the ℓ1-norm of the coefficients in the linear combination under an adaptive Dantzig constraint derived from sharp concentration inequalities. This allows us to consider a wide class of dictionaries. Under local or global structure assumptions, oracle inequalities are derived. These theoretical results are transposed to the adaptive Lasso estimate naturally associated with our Dantzig procedure. The issue of calibrating these procedures is then studied from both theoretical and practical points of view. Finally, a numerical study shows the significant improvement obtained by our procedures when compared with other classical procedures.
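The estimation problem summarized above can be written out as follows. This is a sketch under stated assumptions, not a restatement of the paper's exact definitions: the symbols $X_1,\dots,X_n$ (the sample), $\varphi_1,\dots,\varphi_M$ (the dictionary), $G$ (its Gram matrix), $\hat\beta$ (the empirical coefficients) and $\eta_m$ (the data-driven thresholds) are notation introduced here for illustration. The abstract does not give the expression of $\eta_m$, so it is left unspecified.

```latex
% Requires amsmath and amssymb. All notation below is assumed for illustration.
% Candidate estimate, empirical coefficients and Gram matrix of the dictionary:
\[
f_\lambda = \sum_{m=1}^{M} \lambda_m \varphi_m, \qquad
\hat\beta_m = \frac{1}{n}\sum_{i=1}^{n} \varphi_m(X_i), \qquad
G_{m,m'} = \int \varphi_m(x)\,\varphi_{m'}(x)\,dx .
\]
% Dantzig-type program: minimal l1-norm under coordinate-wise constraints.
% "Adaptive" is read here as: eta_m may depend on m and on the data.
\[
\hat\lambda^{D} \in \operatorname*{arg\,min}_{\lambda\in\mathbb{R}^{M}} \|\lambda\|_{\ell_1}
\quad\text{subject to}\quad
\bigl|(G\lambda)_m - \hat\beta_m\bigr| \le \eta_m, \qquad m = 1,\dots,M .
\]
% A weighted-l1 penalized (Lasso-type) counterpart using the same thresholds:
\[
\hat\lambda^{L} \in \operatorname*{arg\,min}_{\lambda\in\mathbb{R}^{M}}
\Bigl\{ \lambda^{\top} G \lambda - 2\,\lambda^{\top}\hat\beta
      + 2\sum_{m=1}^{M} \eta_m\,|\lambda_m| \Bigr\} .
\]
```

Under this formulation, the first-order optimality conditions of the Lasso-type criterion give $|(G\hat\lambda^{L})_m - \hat\beta_m| \le \eta_m$ for every $m$, so any such minimizer is feasible for the Dantzig constraint. This is one way to see why the two procedures are naturally associated.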
@article{AIHPB_2011__47_1_43_0,
  author = {Bertin, K. and Le Pennec, E. and Rivoirard, V.},
  title = {Adaptive Dantzig density estimation},
  journal = {Annales de l'I.H.P. Probabilit\'es et statistiques},
  volume = {47},
  year = {2011},
  pages = {43-74},
  doi = {10.1214/09-AIHP351},
  mrnumber = {2779396},
  zbl = {1207.62077},
  language = {en},
  url = {http://dml.mathdoc.fr/item/AIHPB_2011__47_1_43_0}
}
Bertin, K.; Le Pennec, E.; Rivoirard, V. Adaptive Dantzig density estimation. Annales de l'I.H.P. Probabilités et statistiques, Volume 47 (2011) pp. 43-74. doi: 10.1214/09-AIHP351. http://gdmltest.u-ga.fr/item/AIHPB_2011__47_1_43_0/