In the purest Bayesian interpretation, we are required to keep the entire posterior distribution over the parameters all the way until prediction in order to compute the posterior predictive distribution; the final prediction is then the expected value of the posterior predictive distribution. In most situations, however, this is computationally very expensive, and we settle for a compromise that is less pure (in the Bayesian sense). ———PS2.3
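To make the contrast concrete, here is a minimal sketch on a toy problem that is not taken from the text: a coin with a Beta prior on its heads probability. The function names, the prior values, and the choice of a MAP point estimate as the "compromise" are all assumptions made for illustration; the text above does not say which compromise it has in mind. The first function keeps the whole posterior and averages predictions over it (here by brute-force summation on a grid), while the second collapses the posterior to a single parameter value before predicting.

```python
def posterior_predictive_heads(a, b, heads, tails, grid_size=10_000):
    """P(next flip is heads | data), averaging over the entire posterior.

    Assumed toy model: theta ~ Beta(a, b), flips ~ Bernoulli(theta).
    The integral over theta is approximated with a midpoint grid on (0, 1).
    """
    thetas = [(i + 0.5) / grid_size for i in range(grid_size)]
    # Unnormalised posterior density at each grid point.
    weights = [t ** (a + heads - 1) * (1 - t) ** (b + tails - 1) for t in thetas]
    total = sum(weights)
    # Expected value of the prediction under the posterior.
    return sum(t * w for t, w in zip(thetas, weights)) / total


def map_plugin_heads(a, b, heads, tails):
    """P(next flip is heads) using only the posterior mode (MAP estimate).

    Valid when a + heads > 1 and b + tails > 1, so the mode is interior.
    """
    return (a + heads - 1) / (a + b + heads + tails - 2)


# Hypothetical numbers: Beta(2, 2) prior, then three heads and no tails.
a, b = 2.0, 2.0
heads, tails = 3, 0

full = posterior_predictive_heads(a, b, heads, tails)
plug = map_plugin_heads(a, b, heads, tails)

print(f"full posterior predictive: {full:.4f}")
print(f"MAP plug-in prediction:    {plug:.4f}")
```

For this conjugate toy model the two answers are available in closed form: the full posterior predictive is (a + heads) / (a + b + heads + tails) = 5/7, and the MAP plug-in is 4/5. The grid sum is only there to show what "keeping the entire posterior" costs in general. With one parameter it is cheap, but the same integral over a high-dimensional parameter vector is what makes the pure approach expensive.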