An Optimal Variable Cell Histogram Based on the Sample Spacings
Kanazawa, Yuichiro
Ann. Statist., Vol. 20 (1992), no. 1, pp. 291-304 / Harvested from Project Euclid
Suppose we wish to construct a variable $k$-cell histogram based on an independent, identically distributed sample of size $n - 1$ from an unknown density $f$ on an interval of finite length. A variable cell histogram requires the cutpoints and heights of all its cells to be specified. We propose the following procedure: (i) choose from the order statistics of the sample a set of $k + 1$ cutpoints that maximizes a criterion, which is a function of the sample spacings; (ii) compute the heights of the $k$ cells according to a formula. The resulting histogram estimates a $k$-cell theoretical histogram that is constant within each cell and that minimizes the Hellinger distance to the density $f$. The histogram tends to estimate low-density regions accurately and is easy to compute. We find that a number of cells of order $n^{1/3}$ minimizes the mean Hellinger distance between the density $f$ and a class of histograms whose cutpoints are chosen from the order statistics.
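The abstract does not reproduce the criterion or the height formula. As a hedged illustration only, the following sketch shows what a Hellinger-optimal piecewise-constant histogram looks like for a known density $f$ and fixed cells. The notation ($I_i$, $\ell_i$, $a_i$, $c_i$), the unnormalized squared form of the Hellinger distance, and the requirement that the histogram integrate to one are assumptions made here, not taken from the paper.

$$
\begin{aligned}
&\text{Cells } I_1,\dots,I_k \text{ with lengths } \ell_i, \quad h = c_i \text{ on } I_i, \quad \sum_{i=1}^{k} c_i \ell_i = 1, \quad a_i = \int_{I_i} \sqrt{f(x)}\,dx, \\
&H^2(f,h) = \int \bigl(\sqrt{f} - \sqrt{h}\bigr)^2 dx = 2 - 2\sum_{i=1}^{k} \sqrt{c_i}\,a_i, \\
&\sum_{i=1}^{k} \sqrt{c_i}\,a_i = \sum_{i=1}^{k} \bigl(\sqrt{c_i \ell_i}\bigr)\frac{a_i}{\sqrt{\ell_i}} \le \Bigl(\sum_{i=1}^{k} \frac{a_i^2}{\ell_i}\Bigr)^{1/2} \quad \text{(Cauchy-Schwarz)}, \\
&\text{with equality when } c_i = \frac{(a_i/\ell_i)^2}{\sum_{j=1}^{k} a_j^2/\ell_j}, \quad \text{so that } \min_{c} H^2(f,h) = 2 - 2\Bigl(\sum_{i=1}^{k} \frac{a_i^2}{\ell_i}\Bigr)^{1/2}.
\end{aligned}
$$

Under these assumptions, the best cutpoints are those that maximize $\sum_i a_i^2/\ell_i$, and each height is proportional to the square of the cell average of $\sqrt{f}$ rather than to the cell average of $f$. The paper's sample-based criterion in terms of spacings may differ in form from this population-level sketch.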
Published: 1992-03-14
Classification: Density estimation, Hellinger distance, histogram, order statistics, spacing, 62G05, 62E20
@article{1176348523,
     author = {Kanazawa, Yuichiro},
     title = {An Optimal Variable Cell Histogram Based on the Sample Spacings},
     journal = {Ann. Statist.},
     volume = {20},
     number = {1},
     year = {1992},
     pages = {291-304},
     language = {en},
     url = {http://dml.mathdoc.fr/item/1176348523}
}
Kanazawa, Yuichiro. An Optimal Variable Cell Histogram Based on the Sample Spacings. Ann. Statist., Vol. 20 (1992), no. 1, pp. 291-304. http://gdmltest.u-ga.fr/item/1176348523/