Two algorithms are compared for maximizing the likelihood function associated with parameter estimation in partially observed diffusion processes: 1) the EM algorithm, an iterative algorithm in which, at each iteration, an auxiliary function is computed and maximized; 2) the direct approach, in which the likelihood function itself is computed and maximized. This leads to a comparison of nonlinear smoothing and nonlinear filtering for computing a class of conditional expectations related to the estimation problem. In particular, it is shown that smoothing is indeed necessary for the EM algorithm approach to be efficient. Time discretization schemes for the stochastic PDEs involved in the algorithms are given, and the link with the discrete-time case (hidden Markov model) is explored.
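
The contrast between the two approaches can be illustrated, purely as a sketch and not as the construction used in the paper, on the discrete-time analogue mentioned above: a finite-state hidden Markov model with Gaussian observations. The direct approach evaluates the likelihood with the forward filter alone, whereas one EM (Baum-Welch style) iteration requires the smoothed conditional probabilities from a forward-backward pass. All model quantities below (the transition matrix, emission means, grid of candidate parameters) are illustrative assumptions, not values from the paper.

import numpy as np

rng = np.random.default_rng(0)

# Toy 2-state HMM with unit-variance Gaussian emissions (illustrative values).
A_true = np.array([[0.9, 0.1], [0.2, 0.8]])   # transition matrix
mu_true = np.array([-1.0, 1.0])               # emission means
T = 200

x = np.zeros(T, dtype=int)
for t in range(1, T):
    x[t] = rng.choice(2, p=A_true[x[t - 1]])
y = mu_true[x] + rng.standard_normal(T)

def emission(y_t, mu):
    # Gaussian emission density with unit variance.
    return np.exp(-0.5 * (y_t - mu) ** 2) / np.sqrt(2 * np.pi)

def forward_filter(y, A, mu):
    """Normalized forward pass: filtered probabilities and log-likelihood."""
    n = len(y)
    alpha = np.zeros((n, 2))
    loglik = 0.0
    pred = np.full(2, 0.5)                    # uniform initial law
    for t in range(n):
        unnorm = pred * emission(y[t], mu)
        c = unnorm.sum()
        loglik += np.log(c)
        alpha[t] = unnorm / c
        pred = alpha[t] @ A                   # one-step prediction
    return alpha, loglik

def backward_smooth(A, alpha):
    """Backward pass: smoothed marginals P(x_t | y_{1:T})."""
    n = len(alpha)
    gamma = np.zeros((n, 2))
    gamma[-1] = alpha[-1]
    for t in range(n - 2, -1, -1):
        pred = alpha[t] @ A                   # predicted law at t+1
        ratio = np.where(pred > 0, gamma[t + 1] / pred, 0.0)
        gamma[t] = alpha[t] * (A @ ratio)
    return gamma

# Direct approach: the filter-based likelihood is computed and maximized
# (here by a crude grid search over the first emission mean only).
grid = np.linspace(-2.0, 0.0, 41)
logliks = [forward_filter(y, A_true, np.array([m, 1.0]))[1] for m in grid]
mu0_direct = grid[int(np.argmax(logliks))]

# EM approach: each iteration needs smoothed conditional expectations (E-step)
# before the auxiliary function is maximized in closed form (M-step).
mu = np.array([-0.5, 0.5])
for _ in range(20):
    alpha, _ = forward_filter(y, A_true, mu)
    gamma = backward_smooth(A_true, alpha)
    mu = (gamma * y[:, None]).sum(axis=0) / gamma.sum(axis=0)

print("direct estimate of mu[0]:", mu0_direct)
print("EM estimate of mu:", mu)

In this sketch the direct evaluation of the likelihood uses only filtered quantities, while the EM update cannot be written without the smoothed marginals, which mirrors the filtering-versus-smoothing comparison discussed in the abstract.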