Missing Data in Hierarchical Classification - a study
Lorga da Silva, Ana ; Bacelar-Nicolau, Helena ; Saporta, Gilbert
HAL, hal-01124655 / Harvested from HAL
We analyse the effect of missing data on the hierarchical classification of variables according to the following factors: amount of missing data, imputation technique, similarity coefficient, and aggregation criterion. We use two imputation methods: a regression method based on ordinary least squares and an EM algorithm. For the similarity matrices we use the basic affinity coefficient and Pearson's correlation coefficient. As aggregation criteria we apply the average linkage, single linkage and complete linkage methods. To compare the structure of the hierarchical classifications, we use Spearman's coefficient between the associated ultrametrics. We present simulation experiments for two multivariate normal cases.
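The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it covers only the OLS regression imputation, Pearson's correlation and the SciPy linkage methods, and leaves out the EM algorithm and the basic affinity coefficient. The block covariance matrix, the missing-completely-at-random deletion, the mean-filling of predictors before regression, and the function names `ols_impute`, `ultrametric` and `compare` are all assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import squareform
from scipy.stats import spearmanr


def ols_impute(data):
    """Fill NaNs in each column by OLS regression on the other columns.

    Assumption of this sketch: predictors are mean-filled first so that
    every row has a complete set of regressors.
    """
    filled = np.where(np.isnan(data), np.nanmean(data, axis=0), data)
    result = data.copy()
    for j in range(data.shape[1]):
        missing = np.isnan(data[:, j])
        if not missing.any():
            continue
        predictors = np.delete(filled, j, axis=1)
        design = np.column_stack([np.ones(len(data)), predictors])
        # Fit on rows where column j is observed, predict where it is missing.
        coef, *_ = np.linalg.lstsq(design[~missing], data[~missing, j], rcond=None)
        result[missing, j] = design[missing] @ coef
    return result


def ultrametric(data, method):
    """Condensed cophenetic (ultrametric) distances between the variables."""
    corr = np.corrcoef(data, rowvar=False)      # Pearson similarity matrix
    dissimilarity = 1.0 - corr
    np.fill_diagonal(dissimilarity, 0.0)
    condensed = squareform(dissimilarity, checks=False)
    return cophenet(linkage(condensed, method=method))


def compare(n_obs=200, missing_rate=0.1, method="average", seed=0):
    """Spearman's coefficient between complete-data and imputed-data ultrametrics."""
    rng = np.random.default_rng(seed)
    # Hypothetical covariance: two blocks of three correlated variables.
    cov = np.full((6, 6), 0.2)
    cov[:3, :3] = 0.7
    cov[3:, 3:] = 0.7
    np.fill_diagonal(cov, 1.0)
    complete = rng.multivariate_normal(np.zeros(6), cov, size=n_obs)
    holed = complete.copy()
    holed[rng.random(complete.shape) < missing_rate] = np.nan
    imputed = ols_impute(holed)
    rho, _ = spearmanr(ultrametric(complete, method), ultrametric(imputed, method))
    return rho
```

Calling `compare(missing_rate=0.2, method="single")` or `method="complete"` varies two of the factors named in the abstract; a value of the coefficient close to 1 indicates that the hierarchy built from imputed data preserves the structure obtained from the complete data.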
Published: 2001-01-01
Classification: [INFO] Computer Science [cs], [MATH.MATH-ST] Mathematics [math]/Statistics [math.ST]
@article{hal-01124655,
     author = {Lorga da Silva, Ana and Bacelar-Nicolau, Helena and Saporta, Gilbert},
     title = {Missing Data in Hierarchical Classification - a study},
     journal = {HAL},
     volume = {2001},
     number = {0},
     year = {2001},
     language = {en},
     url = {http://dml.mathdoc.fr/item/hal-01124655}
}
Lorga da Silva, Ana; Bacelar-Nicolau, Helena; Saporta, Gilbert. Missing Data in Hierarchical Classification - a study. HAL, Volume 2001 (2001) no. 0. http://gdmltest.u-ga.fr/item/hal-01124655/