Assume one observes independent categorical variables or, equivalently, one observes the corresponding multinomial variables. Estimating the distribution of the observed sequence amounts to estimating the expectation of the multinomial sequence. A new estimator for this mean is proposed that is nonparametric, non-asymptotic and implementable even for large sequences. It is a penalized least-squares estimator based on wavelets, with a penalization term inspired by papers of Birgé and Massart. The estimator is proved to satisfy an oracle inequality and to be adaptive in the minimax sense over a class of Besov bodies. The method is embedded in a general framework which allows us to recover also an existing method for segmentation. Beyond theoretical results, a simulation study is reported and an application on real data is provided.