Le terme anglais « Approximate Bayesian Computation » (ABC en abrégé) désigne une famille de techniques bayésiennes ayant pour objet la simulation selon une loi de probabilité lorsque la vraisemblance a posteriori n’est pas disponible ou s’avère impossible à évaluer numériquement. Dans le présent article, nous envisageons cette procédure du point de vue de la théorie des $k$-plus proches voisins, en nous attachant plus particulièrement à examiner les propriétés statistiques des sorties de l’algorithme. Cela nous conduit à analyser le comportement asymptotique d’un estimateur de la densité conditionnelle naturellement associé à ABC, utilisé en pratique et possédant à la fois les caractéristiques d’un estimateur des $k$-plus proches voisins et celles d’une méthode à noyau.
Approximate Bayesian Computation (ABC for short) is a family of computational techniques which offer an almost automated solution in situations where evaluation of the posterior likelihood is computationally prohibitive, or whenever suitable likelihoods are not available. In the present paper, we analyze the procedure from the point of view of $k$-nearest neighbor theory and explore the statistical properties of its outputs. We discuss in particular some asymptotic features of the genuine conditional density estimate associated with ABC, which is an interesting hybrid between a $k$-nearest neighbor and a kernel method.
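To make the nearest-neighbor reading of ABC concrete, the following is a minimal sketch rather than the estimator analyzed in the paper. It assumes a scalar parameter, a Euclidean distance between summary statistics, and a Gaussian kernel with a fixed bandwidth; the names `knn_abc` and `kernel_density`, the toy Gaussian model, and all numerical settings are hypothetical choices made for illustration only.

```python
import numpy as np

def knn_abc(observed_summary, sample_prior, simulate_summary, n_sims, k, rng):
    """Keep the k prior draws whose simulated summaries are closest to the observed one."""
    thetas = np.array([sample_prior(rng) for _ in range(n_sims)], dtype=float)
    summaries = np.array([simulate_summary(t, rng) for t in thetas], dtype=float)
    summaries = summaries.reshape(n_sims, -1)
    target = np.asarray(observed_summary, dtype=float).reshape(1, -1)
    # Euclidean distance between each simulated summary and the observed summary
    distances = np.linalg.norm(summaries - target, axis=1)
    nearest = np.argsort(distances)[:k]
    return thetas[nearest]

def kernel_density(accepted, grid, bandwidth):
    """Gaussian kernel density estimate built from the accepted draws (scalar parameter)."""
    z = (grid[:, None] - accepted[None, :]) / bandwidth
    norm = len(accepted) * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z**2).sum(axis=1) / norm

# Toy model (an assumption): theta ~ N(0, 1), summary = mean of 20 draws from N(theta, 1)
rng = np.random.default_rng(0)
accepted = knn_abc(
    observed_summary=1.0,
    sample_prior=lambda r: r.normal(0.0, 1.0),
    simulate_summary=lambda t, r: r.normal(t, 1.0, size=20).mean(),
    n_sims=5000,
    k=100,
    rng=rng,
)
grid = np.linspace(-3.0, 3.0, 201)
density = kernel_density(accepted, grid, bandwidth=0.2)
```

In this sketch the acceptance threshold is not fixed in advance: it is the distance to the $k$-th nearest simulated summary, which is the nearest-neighbor ingredient, while the smoothing of the accepted parameters supplies the kernel ingredient.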
@article{AIHPB_2015__51_1_376_0, author = {Biau, G\'erard and C\'erou, Fr\'ed\'eric and Guyader, Arnaud}, title = {New insights into Approximate Bayesian Computation}, journal = {Annales de l'I.H.P. Probabilit\'es et statistiques}, volume = {51}, year = {2015}, pages = {376-403}, doi = {10.1214/13-AIHP590}, mrnumber = {3300975}, zbl = {06412909}, language = {en}, url = {http://dml.mathdoc.fr/item/AIHPB_2015__51_1_376_0} }
Biau, Gérard; Cérou, Frédéric; Guyader, Arnaud. New insights into Approximate Bayesian Computation. Annales de l'I.H.P. Probabilités et statistiques, Volume 51 (2015) pp. 376-403. doi: 10.1214/13-AIHP590. http://gdmltest.u-ga.fr/item/AIHPB_2015__51_1_376_0/