Vardi (1985a) introduced an $s$-sample model for biased sampling, gave conditions which guarantee the existence and uniqueness of the nonparametric maximum likelihood estimator $\mathbb{G}_n$ of the common underlying distribution $G$ and discussed numerical methods for calculating the estimator. Here we examine the large sample behavior of the NPMLE $\mathbb{G}_n$, including results on uniform consistency of $\mathbb{G}_n$, convergence of $\sqrt n (\mathbb{G}_n - G)$ to a Gaussian process and asymptotic efficiency of $\mathbb{G}_n$ as an estimator of $G$. The proofs are based upon recent results for empirical processes indexed by sets and functions and convexity arguments. We also give a careful proof of identifiability of the underlying distribution $G$ under connectedness of a certain graph $\mathbf{G}$. Examples and applications include length-biased sampling, stratified sampling, "enriched" stratified sampling, "choice-based" sampling in econometrics and "case-control" studies in biostatistics. A final section discusses design issues and further problems.
@article{1176350948,
author = {Gill, Richard D. and Vardi, Yehuda and Wellner, Jon A.},
title = {Large Sample Theory of Empirical Distributions in Biased Sampling Models},
journal = {Ann. Statist.},
volume = {16},
number = {1},
year = {1988},
pages = {1069--1112},
language = {en},
url = {http://dml.mathdoc.fr/item/1176350948}
}
Gill, Richard D.; Vardi, Yehuda; Wellner, Jon A. Large Sample Theory of Empirical Distributions in Biased Sampling Models. Ann. Statist., Vol. 16 (1988), no. 1, pp. 1069-1112. http://gdmltest.u-ga.fr/item/1176350948/