We consider the situation where one has to maximize a function
$\eta(\theta, \mathbf{x})$ with respect to $\mathbf{x} \in \mathbb{R}^q$,
when $\theta$ is unknown and estimated by least squares through observations
$y_k = \mathbf{f}^{\top}(\mathbf{x}_k)\theta + \varepsilon_k$, with
$\varepsilon_k$ some random error. Classical applications are regulation and
extremum control problems. The approach we adopt corresponds to maximizing the
sum of the current estimated objective and a penalization for poor estimation:
$\mathbf{x}_{k + 1}$ maximizes $\eta(\hat{\theta}^k, \mathbf{x}) +
(\alpha_k/k)\, d_k(\mathbf{x})$, with $\hat{\theta}^k$ the estimated value of
$\theta$ at step $k$ and $d_k$ the penalization function. Sufficient conditions
for strong consistency of $\hat{\theta}^k$ and for almost sure convergence of
$(1/k) \sum_{i=1}^k \eta(\theta, \mathbf{x}_i)$ to the maximum value of
$\eta(\theta, \mathbf{x})$ are derived in the case where $d_k(\cdot)$ is the
variance function used in the sequential construction of $D$-optimum designs. A
classical sequential scheme from adaptive control is shown not to satisfy these
conditions, and numerical simulations confirm that it indeed has convergence
problems.
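
The penalized iteration described above can be illustrated by a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the text: the scalar design space $[-1, 1]$ discretized on a grid, the quadratic regressors $\mathbf{f}(x) = (1, x, x^2)^{\top}$, the objective $\eta(\theta, x) = \mathbf{f}^{\top}(x)\theta$, the Gaussian noise, the weights $\alpha_k = \log(k+1)$, and the function name `run_penalized_design`. The penalization is taken to be $d_k(\mathbf{x}) = \mathbf{f}^{\top}(\mathbf{x}) \mathbf{M}_k^{-1} \mathbf{f}(\mathbf{x})$ with $\mathbf{M}_k = (1/k) \sum_{i=1}^k \mathbf{f}(\mathbf{x}_i)\mathbf{f}^{\top}(\mathbf{x}_i)$, the usual form of the variance function in sequential $D$-optimum design. The choice of $\alpha_k$ is not claimed to satisfy the sufficient conditions mentioned above.

```python
import numpy as np


def f(x):
    """Regressor vector f(x) = (1, x, x^2) for a scalar x (assumed model)."""
    return np.array([1.0, x, x * x])


def run_penalized_design(theta, n_steps=200, noise_sd=0.1, seed=0):
    """Sequentially choose x_{k+1} maximizing eta(theta_hat, x) + (alpha_k/k) d_k(x).

    Hypothetical setup: eta(theta, x) = f(x)^T theta on a grid over [-1, 1],
    alpha_k = log(k + 1), Gaussian errors with standard deviation noise_sd.
    """
    rng = np.random.default_rng(seed)
    grid = np.linspace(-1.0, 1.0, 201)
    F = np.stack([f(x) for x in grid])  # regressors at every grid point

    # Three distinct initial points make the information matrix invertible.
    xs = [-1.0, 0.0, 1.0]
    ys = [f(x) @ theta + noise_sd * rng.standard_normal() for x in xs]

    for k in range(len(xs), n_steps):  # k = number of observations so far
        X = np.stack([f(x) for x in xs])
        # Least-squares estimate of theta from the first k observations.
        theta_hat, *_ = np.linalg.lstsq(X, np.array(ys), rcond=None)
        # Normalized information matrix M_k and variance function d_k on the grid.
        M = X.T @ X / k
        d = np.einsum("ij,jk,ik->i", F, np.linalg.inv(M), F)
        # Estimated objective plus penalization for poor estimation.
        alpha = np.log(k + 1.0)
        criterion = F @ theta_hat + (alpha / k) * d
        x_next = float(grid[np.argmax(criterion)])
        xs.append(x_next)
        ys.append(f(x_next) @ theta + noise_sd * rng.standard_normal())

    X = np.stack([f(x) for x in xs])
    theta_hat, *_ = np.linalg.lstsq(X, np.array(ys), rcond=None)
    return np.array(xs), theta_hat
```

With, for instance, `theta = np.array([0.0, 1.0, -1.0])`, the objective $x - x^2$ has its maximum value $0.25$ at $x = 0.5$, and `np.mean(xs - xs**2)` gives the empirical counterpart of $(1/k) \sum_{i=1}^k \eta(\theta, \mathbf{x}_i)$ for the simulated run.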