The present paper is partly expository and does not assume any previous knowledge of information theory or coding theory. It is meant to be the first in a series of papers on coding theory for noisy channels, a series that replaces the widely circulated report [12]. A few of the results were reported in [11], [27] and [28]. In Sections 2 and 3 we present certain refinements and generalizations of known methods of Shannon, Fano, Feinstein and Gallager. A discussion of some other methods, such as those of Khintchine [14], McMillan [18] and Wolfowitz [25], [26], may be found in the surveys by Kotz [15] and Wolfowitz [28]. Section 4 contains a number of applications; most of it is devoted to a certain memoryless channel with additive noise. Some of the proofs have been collected in Section 5. Finally, Section 6 describes some new results on the relative entropy $H(\mu_1\mid\mu_2)$ of one measure $\mu_1$ relative to another measure $\mu_2$, and its relation to the total variation $\|\mu_1 - \mu_2\|$.