A method of change (or anomaly) detection in high-dimensional discrete-time processes using a multivariate Hotelling chart is presented. We use normal random projections for dimensionality reduction. We indicate the diagnostic properties of the Hotelling control chart applied to data projected onto a random subspace of $\mathbb{R}^n$. We examine the random projection method using artificial noisy image sequences as examples.
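The idea summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the N(0, 1/k) scaling of the projection matrix, the synthetic Gaussian data, the size of the mean shift, and the empirical 99% control limit are all assumptions made for this example, and the function names are hypothetical.

```python
import numpy as np

def random_projection_matrix(n, k, rng):
    """Return a k x n matrix with i.i.d. N(0, 1/k) entries (assumed scaling)."""
    return rng.normal(0.0, 1.0 / np.sqrt(k), size=(k, n))

def fit_reference(Z):
    """Estimate the mean and inverse covariance of in-control projected data Z (m x k)."""
    mean = Z.mean(axis=0)
    cov = np.cov(Z, rowvar=False)
    return mean, np.linalg.inv(cov)

def hotelling_t2(Z, mean, cov_inv):
    """Compute (z - mean)^T cov_inv (z - mean) for every row z of Z."""
    D = Z - mean
    return np.einsum("ij,jk,ik->i", D, cov_inv, D)

rng = np.random.default_rng(0)
n, k, m = 1000, 20, 500            # original dimension, projected dimension, reference size

S = random_projection_matrix(n, k, rng)

# In-control reference data (synthetic, assumed standard normal).
X_ref = rng.normal(size=(m, n))
mean, cov_inv = fit_reference(X_ref @ S.T)
t2_ref = hotelling_t2(X_ref @ S.T, mean, cov_inv)

# Empirical control limit: 99th percentile of the reference statistics (an assumption;
# a distribution-based limit could be used instead).
limit = np.quantile(t2_ref, 0.99)

# New observations: one in-control batch and one with a mean shift in every coordinate.
X_ok = rng.normal(size=(200, n))
X_shift = rng.normal(size=(200, n)) + 1.0

t2_ok = hotelling_t2(X_ok @ S.T, mean, cov_inv)
t2_shift = hotelling_t2(X_shift @ S.T, mean, cov_inv)

alarm_rate_ok = float(np.mean(t2_ok > limit))
alarm_rate_shift = float(np.mean(t2_shift > limit))
print(f"alarm rate, in control: {alarm_rate_ok:.3f}")
print(f"alarm rate, shifted:    {alarm_rate_shift:.3f}")
```

Only the k-dimensional projected vectors enter the chart, so the covariance matrix that must be estimated and inverted is k x k rather than n x n. How visible a given shift is depends on how much of it survives the particular random projection drawn.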
@article{bwmeta1.element.bwnjournal-article-amcv23z2p447bwm,
  author = {Ewa Skubalska-Rafaj\l owicz},
  title = {Random projections and {H}otelling's T$^2$ statistics for change detection in high-dimensional data streams},
  journal = {International Journal of Applied Mathematics and Computer Science},
  volume = {23},
  year = {2013},
  pages = {447-461},
  zbl = {1282.93064},
  language = {en},
  url = {http://dml.mathdoc.fr/item/bwmeta1.element.bwnjournal-article-amcv23z2p447bwm}
}
Ewa Skubalska-Rafajłowicz. Random projections and Hotelling's T² statistics for change detection in high-dimensional data streams. International Journal of Applied Mathematics and Computer Science, Vol. 23 (2013), pp. 447-461. http://gdmltest.u-ga.fr/item/bwmeta1.element.bwnjournal-article-amcv23z2p447bwm/