A High Dimensional Two Sample Significance Test
Dempster, A. P.
Ann. Math. Statist., Vol. 29 (1958) no. 4, pp. 995-1010 / Harvested from Project Euclid
The classical multivariate two-sample significance test based on Hotelling's $T^2$ is undefined when the number $k$ of variables exceeds the number of within-sample degrees of freedom available for estimation of variances and covariances. Addition of an a priori Euclidean metric to the affine $k$-space assumed by the classical method leads to an alternative approach to the same problem. A test statistic $F$, which is the ratio of two mean square distances, is proposed, and three methods of attaching a significance level to $F$ are described. The third method is considered in detail and leads to a "non-exact" significance test in which the null hypothesis distribution of $F$ depends, in approximation, on a single unknown parameter $r$ for which an estimate must be substituted. Approximate distribution theory leads to two independent estimates of $r$ based on nearly sufficient statistics, and these may be combined to yield a single estimate. A test of $F$ nominally at the 5% level but based on an estimate of $r$ rather than $r$ itself has a true significance level which is a function of $r$. This function is investigated and shown to be quite near 5%. The sensitivity of the test to a parameter measuring statistical distance between population means is discussed, and it is shown that arbitrarily small differences in each individual variable can result in a detectable overall difference provided the number of variables (or, more precisely, $r$) can be made sufficiently large. This sensitivity discussion has stated implications for the a priori choice of metric in $k$-space. Finally, a geometrical description of the case of large $r$ is presented.
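As a rough illustration of a statistic that is "the ratio of two mean square distances", the sketch below computes a between-sample squared distance divided by a pooled within-sample mean square distance. It is not taken from the paper: the function name `mean_square_distance_ratio` is hypothetical, the a priori metric is assumed to be the ordinary Euclidean metric in the given coordinates, and the scaling of numerator and denominator is an assumption. It does not attach a significance level, since that requires the estimate of $r$, which the abstract does not specify.

```python
import numpy as np


def mean_square_distance_ratio(x, y):
    """Ratio of between-sample to within-sample mean square distance.

    x, y: arrays of shape (n1, k) and (n2, k); rows are observations.
    Assumes the identity (plain Euclidean) metric on k-space.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = x.shape[0], y.shape[0]

    # Between-sample squared distance of the two mean vectors,
    # scaled so that it carries one degree of freedom.
    diff = x.mean(axis=0) - y.mean(axis=0)
    between = n1 * n2 / (n1 + n2) * (diff @ diff)

    # Within-sample sum of squared distances to each sample mean,
    # averaged over the n1 + n2 - 2 within-sample degrees of freedom.
    within_ss = ((x - x.mean(axis=0)) ** 2).sum() + ((y - y.mean(axis=0)) ** 2).sum()
    within = within_ss / (n1 + n2 - 2)

    return between / within


# Small hand-checkable example with k = 2 variables.
x = [[0.0, 0.0], [2.0, 0.0]]
y = [[1.0, 1.0], [1.0, 3.0]]
print(mean_square_distance_ratio(x, y))  # 2.0
```

Unlike Hotelling's $T^2$, this ratio involves no matrix inverse, so it remains defined when $k$ exceeds the within-sample degrees of freedom.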
Published: 1958-12-14
@article{1177706437,
     author = {Dempster, A. P.},
     title = {A High Dimensional Two Sample Significance Test},
     journal = {Ann. Math. Statist.},
     volume = {29},
     number = {4},
     year = {1958},
     pages = {995--1010},
     language = {en},
     url = {http://dml.mathdoc.fr/item/1177706437}
}
Dempster, A. P. A High Dimensional Two Sample Significance Test. Ann. Math. Statist., Vol. 29 (1958) no. 4, pp. 995-1010. http://gdmltest.u-ga.fr/item/1177706437/