Graphics processing units in acceleration of bandwidth selection for kernel density estimation
Witold Andrzejewski ; Artur Gramacki ; Jarosław Gramacki
International Journal of Applied Mathematics and Computer Science, Volume 23 (2013), pp. 869-885 / Harvested from The Polish Digital Mathematics Library

The Probability Density Function (PDF) is a key concept in statistics. Constructing the most adequate PDF from observed data remains an important and interesting scientific problem, especially for large datasets. PDFs are often estimated using nonparametric data-driven methods. One of the most popular nonparametric methods is the Kernel Density Estimator (KDE). However, a serious drawback of KDEs is the large number of calculations required to compute them, especially when searching for the optimal bandwidth parameter. In this paper we investigate the possibility of utilizing Graphics Processing Units (GPUs) to accelerate bandwidth selection. The contribution of this paper is threefold: (a) we propose an algorithmic optimization of one of the bandwidth selection algorithms, (b) we propose efficient GPU versions of three bandwidth selection algorithms, and (c) we experimentally compare three of our GPU implementations with ones that utilize only CPUs. Our experiments show orders-of-magnitude improvements over CPU implementations of the classical algorithms.
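To make the computational cost mentioned in the abstract concrete, the following is a minimal sketch of a univariate Gaussian KDE together with a naive bandwidth search. It is an illustration under stated assumptions, not the method of the paper: the selection criterion used here is leave-one-out likelihood cross-validation, chosen only for its simplicity, and the function names (`kde`, `loo_log_likelihood`, `select_bandwidth`) are hypothetical. Each candidate bandwidth requires a pass over all pairs of observations, which is the kind of quadratic workload that motivates GPU acceleration.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel function."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, data, h):
    """Kernel density estimate at point x for bandwidth h > 0."""
    n = len(data)
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (n * h)


def loo_log_likelihood(data, h):
    """Leave-one-out log-likelihood of the data for bandwidth h.

    Assumption: this criterion is an illustrative choice, not necessarily
    one of the algorithms studied in the paper. Cost is O(n^2) per call.
    """
    n = len(data)
    total = 0.0
    for i, xi in enumerate(data):
        # Sum kernel contributions from every observation except the i-th.
        s = sum(
            gaussian_kernel((xi - xj) / h)
            for j, xj in enumerate(data)
            if j != i
        )
        density = s / ((n - 1) * h)
        # Guard against log(0) when the kernel values underflow.
        total += math.log(max(density, 1e-300))
    return total


def select_bandwidth(data, candidates):
    """Return the candidate bandwidth with the highest leave-one-out score."""
    return max(candidates, key=lambda h: loo_log_likelihood(data, h))
```

With n observations and m candidate bandwidths, this search evaluates on the order of m·n² kernel terms, and the terms for different pairs are independent of one another, so the inner sums are a natural fit for a parallel reduction.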

Published: 2013-01-01
EUDML-ID : urn:eudml:doc:262420
@article{bwmeta1.element.bwnjournal-article-amcv23z4p869bwm,
     author = {Witold Andrzejewski and Artur Gramacki and Jaros\l aw Gramacki},
     title = {Graphics processing units in acceleration of bandwidth selection for kernel density estimation},
     journal = {International Journal of Applied Mathematics and Computer Science},
     volume = {23},
     year = {2013},
     pages = {869-885},
     zbl = {1284.93221},
     language = {en},
     url = {http://dml.mathdoc.fr/item/bwmeta1.element.bwnjournal-article-amcv23z4p869bwm}
}
Witold Andrzejewski; Artur Gramacki; Jarosław Gramacki. Graphics processing units in acceleration of bandwidth selection for kernel density estimation. International Journal of Applied Mathematics and Computer Science, Volume 23 (2013), pp. 869-885. http://gdmltest.u-ga.fr/item/bwmeta1.element.bwnjournal-article-amcv23z4p869bwm/
