Let $(X, Y), (X_1, Y_1), \ldots, (X_n, Y_n)$ be independent, identically distributed random vectors taking values in $R^d \times R$, and let $E(|Y|^p) < \infty$ for some $p \geq 1$. We wish to estimate the regression function $m(x) = E(Y \mid X = x)$ by $m_n(x)$, a function of $x$ and $(X_1, Y_1), \ldots, (X_n, Y_n)$. For large classes of kernel estimates and nearest neighbor estimates, sufficient conditions are given for $E\{|m_n(x) - m(x)|^p\} \rightarrow 0$ as $n \rightarrow \infty$, for almost all $x$. No additional conditions are imposed on the distribution of $(X, Y)$. As a by-product, assuming only the boundedness of $Y$, the almost sure convergence to 0 of $E\{|m_n(X) - m(X)| \mid X_1, Y_1, \ldots, X_n, Y_n\}$ is established for the same estimates. Finally, the weak and strong Bayes risk consistency of the corresponding nonparametric discrimination rules is proved for all possible distributions of the data.
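
For orientation, the two families of estimates mentioned above can be sketched as follows. This is a minimal illustration and not the general classes treated here: it assumes a uniform (window) kernel with a hypothetical bandwidth `h`, equal weights on the `k` nearest neighbors, Euclidean distance, and the convention that the kernel estimate is 0 when no sample point falls in the window. The function names `kernel_estimate` and `knn_estimate` are introduced only for this sketch.

```python
import numpy as np


def kernel_estimate(x, X, Y, h):
    """Window-kernel regression estimate m_n(x): the average of the Y_i
    whose X_i lie within Euclidean distance h of x (assumed convention:
    return 0.0 when the window is empty)."""
    dist = np.linalg.norm(X - x, axis=1)
    weights = (dist <= h).astype(float)
    total = weights.sum()
    return float(weights @ Y / total) if total > 0 else 0.0


def knn_estimate(x, X, Y, k):
    """k-nearest-neighbor regression estimate m_n(x): the average of the
    Y_i belonging to the k sample points closest to x (ties broken by
    sample index, an assumption of this sketch)."""
    dist = np.linalg.norm(X - x, axis=1)
    nearest = np.argsort(dist, kind="stable")[:k]
    return float(Y[nearest].mean())


if __name__ == "__main__":
    # Simulated data with m(x) = x^2 in dimension d = 1 (illustrative only).
    rng = np.random.default_rng(0)
    X = rng.uniform(-1.0, 1.0, size=(2000, 1))
    Y = X[:, 0] ** 2 + rng.normal(scale=0.1, size=2000)
    x0 = np.array([0.5])
    print(kernel_estimate(x0, X, Y, h=0.05))
    print(knn_estimate(x0, X, Y, k=50))
```

In results of this kind, the bandwidth is typically taken to shrink and the number of neighbors to grow with $n$ at suitable rates; the fixed values of `h` and `k` in the usage example are arbitrary choices for illustration.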