Consider a class $\mathscr{P}=\{P_\theta:\theta\in\Theta\}$ of probability measures on a measurable space $(\mathscr{X},\mathscr{A})$, dominated by a $\sigma$-finite measure $\mu$. Let $f_\theta=dP_\theta/d\mu$, $\theta\in\Theta$, and let $\hat{\theta}_n$ be a maximum likelihood estimator based on $n$ independent observations from $P_{\theta_0}$, $\theta_0\in\Theta$. We use results from empirical process theory to obtain convergence of the Hellinger distance $h(f_{\hat{\theta}_n}, f_{\theta_0})$, under certain entropy conditions on the class of densities $\{f_\theta:\theta\in\Theta\}$. The examples we present are a model with interval censored observations, smooth densities, monotone densities, and convolution models. In most examples, the convexity of the class of densities is of special importance.
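For reference, a minimal sketch of the two central quantities, written under one common normalization (some authors omit the factor $\tfrac{1}{2}$ in the Hellinger distance; the paper's own convention may differ): given i.i.d. observations $X_1,\dots,X_n$ from $P_{\theta_0}$,
\[
\hat{\theta}_n \in \arg\max_{\theta\in\Theta} \sum_{i=1}^{n} \log f_\theta(X_i),
\qquad
h^2(f,g) = \frac{1}{2}\int \bigl(\sqrt{f}-\sqrt{g}\,\bigr)^2 \, d\mu .
\]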