Consider the usual decision-theoretic situation in which one observes a random vector $X$ from which an estimate of its classification $\theta \in \{0, 1\}$ is to be made. If one knows the a priori probabilities for $\theta$ and the conditional densities of $X$ given $\theta$, then the smallest probability of error that can be achieved is called the Bayes risk and is denoted by $R^\ast$. Assuming that the a priori probabilities and conditional densities are unknown, we consider the problem of estimating $R^\ast$ from the independent observations $(X_1, \theta_1),\cdots, (X_n, \theta_n)$. Suppose $X$ has an unknown classification $\theta$, where $(X, \theta)$ is independent of the observations $(X_1, \theta_1),\cdots, (X_n, \theta_n)$. If $\{\delta_n\}$ is a sequence of decision procedures, where $\delta_n$ determines the estimate of $\theta$ from $X$ and $(X_1, \theta_1),\cdots, (X_n, \theta_n)$, then the notion of a deleted estimate of $R^\ast$ with $\delta_n$ is introduced, and this estimate is shown, under mild assumptions, to be consistent for $R^\ast$.
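For concreteness, one common way to write these quantities is sketched below; the symbols $\eta$ and $\hat{R}_n$ are introduced here only for illustration and are not necessarily the notation of the paper. Writing $\eta(x) = P(\theta = 1 \mid X = x)$ for the a posteriori probability, the Bayes risk for the $0$--$1$ loss is
\[
R^\ast \;=\; E\bigl[\min\bigl(\eta(X),\, 1 - \eta(X)\bigr)\bigr],
\]
and a deleted estimate of $R^\ast$ with $\delta_n$ is typically of the leave-one-out form
\[
\hat{R}_n \;=\; \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\bigl\{\delta_{n-1}\bigl(X_i;\, (X_j, \theta_j)_{j \neq i}\bigr) \neq \theta_i\bigr\},
\]
in which each $X_i$ is classified by the procedure applied to the remaining $n-1$ observation pairs and the resulting errors are averaged; consistency then means $\hat{R}_n \to R^\ast$ in an appropriate sense (e.g., in probability) as $n \to \infty$.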