Let $\mathbf{X} = (X_1, \ldots, X_m)'$ and $\mathbf{Y} = (Y_1, \ldots, Y_n)'$ be two random vectors. Given any random vector $\mathbf{Z}$, let $\mathbf{Y}^\ast_Z$ be the best linear predictor of $\mathbf{Y}$ based on $\mathbf{Z}$. Let $p$ be any natural number smaller than $m$. We consider the problem of finding the $p$-dimensional random vector $\mathbf{Z} = (Z_1, \ldots, Z_p)'$, with each component $Z_i$ a linear function of $\mathbf{X}$, that minimizes the determinant of $E(\mathbf{Y} - \mathbf{Y}^\ast_Z)(\mathbf{Y} - \mathbf{Y}^\ast_Z)'$. We show that $Z_1, \ldots, Z_p$ coincide with the first $p$ canonical variables, up to a nonsingular linear transformation. We also show that the square of the $(p + 1)$th canonical correlation coefficient measures the relative improvement in the prediction of $\mathbf{Y}$ when $p + 1$ variables $Z_i$ are used instead of $p$.
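
The two results can be connected by a short worked identity. This is a sketch in notation that does not appear in the statement above: $\Sigma_{YY}$ denotes the covariance matrix of $\mathbf{Y}$, $\rho_1 \ge \rho_2 \ge \cdots$ denote the canonical correlation coefficients between $\mathbf{X}$ and $\mathbf{Y}$, and $D_p$ denotes the minimized determinant for a given $p$. It assumes that the covariance matrices of $\mathbf{X}$ and $\mathbf{Y}$ are nonsingular. If $\mathbf{Z}$ consists of the first $p$ canonical variables, then

$$
D_p = \det(\Sigma_{YY}) \prod_{i=1}^{p} (1 - \rho_i^2),
\qquad
\frac{D_p - D_{p+1}}{D_p} = \rho_{p+1}^2 .
$$

Under this reading, "relative improvement" is the proportional reduction in the determinant of the prediction error matrix obtained by adding the $(p + 1)$th canonical variable.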