We consider, simultaneously, $N$ statistical decision problems with identical generic structure: state space $\Omega$, action space $A$, sample space $\mathscr{X}$, and nonnegative loss function $L$ defined on $\Omega \times A \times \mathscr{X}$. With $\mathbf{x} = (x_1, \cdots, x_N)$ distributed according to $\prod^N_{i=1} \mathbf{P}_{\theta_i} = \mathbf{P}_{\boldsymbol{\theta}}$, a compound procedure is a vector $\boldsymbol{\phi} = (\phi_1, \cdots, \phi_N)$ such that $\phi_i(\mathbf{x}) \in A$ for each $i$ and $\mathbf{x}$. The risk of the procedure $\boldsymbol{\phi}$ is $\mathbf{R}(\boldsymbol{\theta}, \boldsymbol{\phi}) = N^{-1} \sum^N_{r=1} \int L(\theta_r, \phi_r(\mathbf{x}), x_r)\, \mathbf{P}_{\boldsymbol{\theta}}(d\mathbf{x})$, and the modified regret is $\mathbf{D}(\boldsymbol{\theta}, \boldsymbol{\phi}) = \mathbf{R}(\boldsymbol{\theta}, \boldsymbol{\phi}) - R(G)$, where $G$ is the empirical distribution of $\theta_1, \cdots, \theta_N$ and $R(G)$ is the Bayes risk against $G$ in the component problem. We discuss quite wide classes of procedures $\boldsymbol{\phi}$ which consist of using $\mathbf{x}$ to estimate $G$ and then playing $\varepsilon$-Bayes against the estimate in each component problem. For one class we establish a type of uniform convergence of the conditional risk in the $m \times n$ problem (i.e., $\Omega$ has $m$ elements and $A$ has $n$), and use this to obtain $\mathbf{D}(\boldsymbol{\theta}, \boldsymbol{\phi}) < \varepsilon + o(1)$ for another class in the $m \times n$ and $m \times \infty$ problems. Similar, but weaker, results are given in Part II for the case in which $\Omega$ is infinite.
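To make the procedure class concrete, the following is a minimal sketch, not taken from the paper, of the "estimate $G$, then play Bayes against the estimate" recipe in a finite $m \times n$ problem with finite sample space. The component densities `p`, the loss matrix `loss` (assumed here not to depend on $x$), and the estimate `G_hat` are illustrative assumptions; in particular, the estimation of $G$ from $\mathbf{x}$ is left abstract and an externally supplied estimate is used.

```python
import numpy as np

# Sketch of a compound procedure for a finite m x n problem.
# Omega = {0, ..., m-1}, A = {0, ..., n-1}, X = {0, ..., k-1}.
# p[theta, x]   : known component probabilities P_theta({x})
# loss[theta, a]: component loss (assumed independent of x here)
# G_hat         : some estimate of the empirical distribution G of
#                 theta_1, ..., theta_N, obtained from the data x

def bayes_action(G_hat, p, loss, x_r):
    """Bayes action against the prior G_hat, given the r-th observation."""
    post = G_hat * p[:, x_r]        # unnormalized posterior over states
    exp_loss = post @ loss          # posterior expected loss of each action
    return int(np.argmin(exp_loss))

def compound_procedure(x, G_hat, p, loss):
    """Play Bayes against G_hat separately in each component problem."""
    return [bayes_action(G_hat, p, loss, x_r) for x_r in x]

# Toy 2 x 2 example with a binary sample space (all numbers illustrative).
p = np.array([[0.8, 0.2],           # P_0
              [0.3, 0.7]])          # P_1
loss = np.array([[0.0, 1.0],        # loss(theta=0, a)
                 [1.0, 0.0]])       # loss(theta=1, a)
x = np.array([0, 1, 1, 0, 1])
G_hat = np.array([0.4, 0.6])        # assumed estimate of G from x
print(compound_procedure(x, G_hat, p, loss))
```

An exactly Bayes (or $\varepsilon$-Bayes) component rule is used against the same estimate $\widehat{G}$ in every coordinate; the paper's results concern how the resulting modified regret behaves as $N$ grows.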