On the distribution of the largest eigenvalue in principal components analysis
Johnstone, Iain M.
Ann. Statist., Vol. 29 (2001) no. 2, pp. 295-327 / Harvested from Project Euclid
Let $x_{(1)}$ denote the square of the largest singular value of an $n \times p$ matrix $X$, all of whose entries are independent standard Gaussian variates. Equivalently, $x_{(1)}$ is the largest principal component variance of the covariance matrix $X'X$, or the largest eigenvalue of a p-variate Wishart distribution on n degrees of freedom with identity covariance.

Consider the limit of large p and n with $n/p = \gamma \ge 1$. When centered by $\mu_p = (\sqrt{n-1} + \sqrt{p})^2$ and scaled by $\sigma_p = (\sqrt{n-1} + \sqrt{p})(1/\sqrt{n-1} + 1/\sqrt{p})^{1/3}$, the distribution of $x_{(1)}$ approaches the Tracy–Widom law of order 1, which is defined in terms of the Painlevé II differential equation and can be numerically evaluated and tabulated in software. Simulations show the approximation to be informative for n and p as small as 5.

The limit is derived via a corresponding result for complex Wishart matrices using methods from random matrix theory. The result suggests that some aspects of large p multivariate distribution theory may be easier to apply in practice than their fixed p counterparts.
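As a small illustration of the centering and scaling constants stated in the abstract, the following Python sketch evaluates $\mu_p$ and $\sigma_p$ for given n and p and standardizes an observed value of the largest eigenvalue. The function names `johnstone_center_scale` and `standardize` are hypothetical and chosen only for this example; the sketch does not evaluate the Tracy–Widom distribution itself, which would need a separate numerical routine.

```python
import math


def johnstone_center_scale(n, p):
    """Return (mu_p, sigma_p) as given in the abstract.

    mu_p    = (sqrt(n-1) + sqrt(p))^2
    sigma_p = (sqrt(n-1) + sqrt(p)) * (1/sqrt(n-1) + 1/sqrt(p))^(1/3)
    """
    if n < 2 or p < 1:
        # sqrt(n-1) must be positive for the scaling term to be defined
        raise ValueError("need n >= 2 and p >= 1")
    a = math.sqrt(n - 1)
    b = math.sqrt(p)
    mu = (a + b) ** 2
    sigma = (a + b) * (1.0 / a + 1.0 / b) ** (1.0 / 3.0)
    return mu, sigma


def standardize(x1, n, p):
    """Center and scale an observed largest eigenvalue x1 of X'X."""
    mu, sigma = johnstone_center_scale(n, p)
    return (x1 - mu) / sigma


if __name__ == "__main__":
    mu, sigma = johnstone_center_scale(10, 4)
    print(mu, sigma)
```

Under the abstract's result, the standardized value returned by `standardize` would be compared against quantiles of the Tracy–Widom law of order 1, taken from a table or a numerical implementation.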
Published: 2001-04-14
Classification: Karhunen–Loève transform, Laguerre ensemble, empirical orthogonal functions, largest eigenvalue, largest singular value, Laguerre polynomial, Wishart distribution, Plancherel–Rotach asymptotics, Painlevé equation, Tracy–Widom distribution, random matrix theory, Fredholm determinant, Liouville–Green method, 62H25, 62F20, 33C45, 60H25
@article{1009210544,
     author = {Johnstone, Iain M.},
     title = {On the distribution of the largest eigenvalue in principal components analysis},
     journal = {Ann. Statist.},
     volume = {29},
     number = {2},
     year = {2001},
     pages = {295--327},
     language = {en},
     url = {http://dml.mathdoc.fr/item/1009210544}
}
Johnstone, Iain M. On the distribution of the largest eigenvalue in principal components analysis. Ann. Statist., Vol. 29 (2001) no. 2, pp. 295-327. http://gdmltest.u-ga.fr/item/1009210544/