Linear prediction theory for a stationary sequence $X_n$ ordinarily begins with the assumption that the covariance $R(n) = E(X_{m + n} \bar{X}_m) = \int^\pi_{-\pi} e^{i\lambda n} dF(\lambda)$ is known. The best linear predictor of $X_0$ given the past $X_{-1}, X_{-2}, \ldots$ is then the projection $\psi$ of $X_0$ on the span of $X_{-1}, X_{-2}, \ldots$, and the prediction error is $E(|X_0 - \psi|^2)$. In practice $R$ is not known but is estimated from the past. If the process is ergodic and the entire past is known, this causes no problem, since the estimate $\hat{R}$ of $R$ must then equal $R$; but if the process is not ergodic, $\hat{R}$ does not in general equal $R$. In this paper we consider the relationship between prediction using $\hat{R}$ and prediction using $R$. One conclusion is that if the process is Gaussian, it does not matter whether $\hat{R}$ or $R$ is used in constructing the best linear predictor: the predictor is the same and the prediction error is the same.
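As a concrete illustration of the classical setup (not part of the paper's argument), the best linear predictor from a *finite* past $X_{-1}, \ldots, X_{-p}$ can be computed from the covariances alone by solving the normal equations $R(j) = \sum_{k=1}^p a_k R(j-k)$, $j = 1, \ldots, p$; the covariance $R(n) = \rho^{|n|}$ used below is a hypothetical example (an AR(1)-type sequence), for which the known answer is $\psi = \rho X_{-1}$ with error $1 - \rho^2$. A minimal sketch, assuming a real-valued process:

```python
import numpy as np

def best_linear_predictor(R, p):
    """Coefficients a_1, ..., a_p of the best linear predictor
    psi = sum_k a_k X_{-k} of X_0 given X_{-1}, ..., X_{-p},
    from the covariances R(0), ..., R(p) (real stationary case).

    Solves the normal equations R(j) = sum_k a_k R(j - k), j = 1..p,
    which express orthogonality of X_0 - psi to each X_{-j}."""
    # Toeplitz matrix of covariances among the predictor variables
    T = np.array([[R[abs(j - k)] for k in range(p)] for j in range(p)])
    # Covariances between X_0 and the past variables
    r = np.array([R[j + 1] for j in range(p)])
    a = np.linalg.solve(T, r)
    # Prediction error E|X_0 - psi|^2 = R(0) - a . r
    err = R[0] - a @ r
    return a, err

# Hypothetical example: R(n) = rho^|n|, so psi = rho * X_{-1}
rho = 0.5
R = [rho ** n for n in range(6)]
a, err = best_linear_predictor(R, 5)
```

Here only $R$ enters the computation; replacing $R$ by an estimate $\hat{R}$ changes the predictor exactly insofar as $\hat{R}$ differs from $R$, which is the point at issue for non-ergodic processes.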