Bayesian Data Analysis

The process of Bayesian data analysis can be idealized by dividing it into three steps:

  1. Setting up a (full) probability model.
  2. Conditioning on observed data: calculating and interpreting the appropriate posterior distribution.
  3. Evaluating the fit of the model and the implications of the resulting posterior distributions.

Direct quantification of uncertainty: "Parameters" of the model are random variables (hidden variables).

$$ P(\theta \mid \mathcal{D}) = \frac{P(\mathcal{D},\theta)}{P(\mathcal{D})} = \frac{P(\mathcal{D} \mid \theta)\, P(\theta)}{P(\mathcal{D})} = \frac{P(\mathcal{D} \mid \theta)\, P(\theta)}{\int_\Theta P(\mathcal{D},\theta')\,d\theta'} = \frac{P(\mathcal{D} \mid \theta)\, P(\theta)}{\int_\Theta P(\mathcal{D} \mid \theta')\, P(\theta')\,d\theta'} $$
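As a minimal sketch of steps 1 and 2, the posterior above can be approximated on a grid. The coin-flip data ($k=7$ heads in $n=10$ flips) and the flat prior are assumptions chosen purely for illustration:

```python
import numpy as np

# Hypothetical data: 7 heads in 10 coin flips; theta = P(heads).
n, k = 10, 7

# Discretize the parameter space (grid approximation of the evidence integral).
theta = np.linspace(0.001, 0.999, 999)

prior = np.ones_like(theta)                   # flat prior P(theta)
likelihood = theta**k * (1 - theta)**(n - k)  # P(D | theta), binomial kernel
unnormalized = likelihood * prior             # proportional to P(D, theta)

# Dividing by the sum plays the role of P(D) = ∫ P(D | theta') P(theta') dtheta'.
posterior = unnormalized / unnormalized.sum()

print(theta[np.argmax(posterior)])  # posterior mode ≈ k/n = 0.7 under a flat prior
```

With a flat prior the posterior is Beta$(k+1, n-k+1)$, so the grid mode matches the maximum-likelihood estimate $k/n$.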


Predict new data $\mathcal{\bar D}$ based on observed data $\mathcal{D}$: $$ P(\mathcal{\bar D} \mid \mathcal{D}) = \int_\Theta P(\mathcal{\bar D} \mid \theta) P(\theta \mid \mathcal{D}) d\theta $$

  • Averaging predictions $P(\mathcal{\bar D} \mid \theta)$, weighted by posterior $P(\theta \mid \mathcal{D})$
  • $\Theta = \{\theta \mid P(\theta \mid \mathcal{D}) >0\} $ is support of $P(\theta \mid \mathcal{D})$
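The posterior predictive integral can be sketched with the same grid approximation. Continuing the hypothetical coin example (flat prior, $k=7$ heads in $n=10$ flips), the probability that the next flip is heads is the posterior mean of $\theta$:

```python
import numpy as np

# Posterior over theta on a grid (flat prior, assumed coin-flip data k=7, n=10).
n, k = 10, 7
theta = np.linspace(0.001, 0.999, 999)
posterior = theta**k * (1 - theta)**(n - k)
posterior /= posterior.sum()  # normalize over the grid

# P(next flip = heads | D) = sum_theta P(heads | theta) * P(theta | D):
# each prediction P(Dbar | theta) = theta, weighted by the posterior.
p_heads = np.sum(theta * posterior)
print(round(p_heads, 3))  # ≈ (k+1)/(n+2) = 8/12 ≈ 0.667 (Laplace's rule of succession)
```

Note that the predictive probability is pulled toward $1/2$ relative to the raw frequency $k/n = 0.7$, because the flat prior still carries weight in the average.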


  • Objective Prior: non-informative prior; good frequentist properties
  • Subjective Prior: capture beliefs, not arbitrary
  • Hierarchical Prior: multiple levels of priors, e.g. $$ p(\theta) = \int p(\theta \mid \alpha)\, p(\alpha)\, d\alpha = \int\!\!\int p(\theta \mid \alpha)\, p(\alpha \mid \beta)\, p(\beta)\, d\alpha\, d\beta $$
  • Empirical Prior: learn some of the parameters of the prior from data
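The hierarchical marginalization $p(\theta) = \int p(\theta \mid \alpha)\, p(\alpha)\, d\alpha$ can be checked numerically. The two-level prior here, $\theta \mid \alpha \sim N(\alpha, 1)$ with $\alpha \sim N(0, 1)$, is an assumption chosen because its marginal is known in closed form, $\theta \sim N(0, \sqrt{2})$:

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Hypothetical two-level prior: theta | alpha ~ N(alpha, 1), alpha ~ N(0, 1).
alpha = np.linspace(-10.0, 10.0, 2001)
dalpha = alpha[1] - alpha[0]

def p_theta(theta):
    # p(theta) = ∫ p(theta | alpha) p(alpha) d(alpha), approximated as a grid sum.
    return np.sum(normal_pdf(theta, alpha, 1.0) * normal_pdf(alpha, 0.0, 1.0)) * dalpha

# The analytic marginal is N(0, sqrt(2)); compare the two at theta = 0.
print(p_theta(0.0), normal_pdf(0.0, 0.0, np.sqrt(2)))
```

The grid sum and the closed-form marginal agree, illustrating that the hierarchy simply defines an ordinary prior on $\theta$ once the hyperparameter is integrated out.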

Pitfalls of any Bayesian MCMC analysis:

  • non-convergence of the chains
  • sensitivity to the choice of priors
  • misspecification of the model
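Non-convergence is usually screened with the Gelman-Rubin statistic $\hat R$, which compares between-chain and within-chain variance. A minimal sketch on synthetic chains (the Gaussian draws and the shift of 5 are illustrative assumptions, not real MCMC output):

```python
import numpy as np

def rhat(chains):
    """Gelman-Rubin potential scale reduction factor.

    chains: array of shape (m, n) -- m chains with n draws each.
    Values near 1.0 suggest the chains have mixed; values well above 1.0
    flag non-convergence.
    """
    n = chains.shape[1]
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)          # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()    # within-chain variance
    var_plus = (n - 1) / n * W + B / n       # pooled posterior-variance estimate
    return np.sqrt(var_plus / W)

rng = np.random.default_rng(0)
mixed = rng.normal(0.0, 1.0, size=(4, 1000))            # 4 chains, same target
stuck = mixed + np.array([[0.0], [0.0], [5.0], [5.0]])  # 2 chains stuck in another mode
print(rhat(mixed))  # close to 1.0
print(rhat(stuck))  # much larger than 1.0
```

In practice one would also inspect trace plots and effective sample sizes; $\hat R$ alone cannot rule out that all chains are stuck in the same region.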


  • Andrew Gelman, John Carlin, Hal Stern, David Dunson, Aki Vehtari, and Donald Rubin: Bayesian Data Analysis, 3rd edition, 2013, CRC Press, ISBN-13: 978-1439840955