The Bayesian Information Criterion (BIC) estimates the order of a
Markov chain (with finite alphabet $A$) from observation of a sample path $x_1,
x_2,\dots, x_n$, as the value $k = \hat{k}$ that minimizes the sum of the
negative logarithm of the $k$th order maximum likelihood and the penalty term
$\frac{|A|^k(|A|-1)}{2}\log n$. We show that $\hat{k}$ equals the correct order
of the chain, eventually almost surely as $n \rightarrow \infty$, thereby
strengthening earlier consistency results that assumed an a priori bound on the
order. A key tool is a strong ratio-typicality result for Markov sample
paths. We also show that the Bayesian estimator, or minimum description length
estimator, of which the BIC estimator is regarded as an approximation, fails to
be consistent for the uniformly distributed i.i.d. process.
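For concreteness, the following is a minimal Python sketch of the criterion
defined above: for each candidate order $k$ it computes the negative logarithm
of the $k$th order maximum likelihood of the sample path (conditioned on the
first $k$ symbols) plus the penalty $\frac{|A|^k(|A|-1)}{2}\log n$, and returns
the minimizing $k$. The function name \texttt{bic\_order} and the finite scan
range \texttt{k\_max} are illustrative choices, not part of the paper; indeed,
the result stated above is precisely that consistency holds without an a priori
bound on the order, so the cap here is only a practical device.

\begin{verbatim}
import math
from collections import Counter

def bic_order(x, alphabet_size, k_max):
    """BIC order estimate for a finite-alphabet sample path x."""
    n = len(x)
    A = alphabet_size
    best_k, best_score = 0, float("inf")
    for k in range(k_max + 1):
        trans = Counter()   # counts N(a, b) of (length-k context, next symbol)
        ctx = Counter()     # counts N(a) of each length-k context
        for i in range(k, n):
            c = tuple(x[i - k:i])
            trans[(c, x[i])] += 1
            ctx[c] += 1
        # Negative log of the k-th order maximum likelihood:
        # -sum over (a, b) of N(a, b) * log(N(a, b) / N(a)).
        neg_log_ml = -sum(cnt * math.log(cnt / ctx[c])
                          for (c, b), cnt in trans.items())
        # Penalty term |A|^k (|A| - 1) / 2 * log n.
        penalty = (A ** k) * (A - 1) / 2 * math.log(n)
        score = neg_log_ml + penalty
        if score < best_score:
            best_k, best_score = k, score
    return best_k
\end{verbatim}

As a usage illustration (again hypothetical, with arbitrarily chosen transition
probabilities), simulating a first-order binary chain and applying the
estimator typically recovers order $1$ for moderate sample sizes:

\begin{verbatim}
import random
random.seed(0)
# First-order binary chain with P(1|0) = 0.1 and P(1|1) = 0.6.
x = [0]
for _ in range(10000):
    p = 0.1 if x[-1] == 0 else 0.6
    x.append(1 if random.random() < p else 0)
print(bic_order(x, alphabet_size=2, k_max=5))  # typically prints 1
\end{verbatim}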