Factor Analysis

The generative model for factor analysis assumes that the data was produced in three stages:

  1. Pick values independently for some hidden factors that have Gaussian priors.
  2. Linearly combine the factors using a factor loading matrix. Use more linear combinations than there are factors, so that the observation has more dimensions than the hidden factors.
  3. Add Gaussian noise with a separate variance for each observed dimension.
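
The three stages can be sketched directly in NumPy. This is only an illustration: the function name `sample_factor_model`, the loading matrix `W` and the noise variances `psi` are made-up values chosen to match the two-factor, three-observation diagram, not parameters taken from any data set.

```python
import numpy as np

def sample_factor_model(W, psi, n_samples, rng):
    """Draw observations by following the three generative stages."""
    n_observed, n_factors = W.shape
    # Stage 1: independent hidden factors with standard Gaussian priors.
    z = rng.standard_normal((n_samples, n_factors))
    # Stage 2: linear combinations of the factors, one per observed dimension.
    mixed = z @ W.T
    # Stage 3: Gaussian noise with its own variance for each observed dimension.
    u = rng.standard_normal((n_samples, n_observed)) * np.sqrt(psi)
    return mixed + u, z

rng = np.random.default_rng(0)
# Hypothetical loadings: 3 observed dimensions (rows), 2 factors (columns).
W = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 2.0]])
# Hypothetical noise variances, one per observed dimension.
psi = np.array([0.1, 0.2, 0.3])

x, z = sample_factor_model(W, psi, n_samples=200_000, rng=rng)
print(x.shape, z.shape)
```

Each row of `x` is one observation and the matching row of `z` holds the factors that produced it. In a real problem only `x` would be available.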
In [12]:
%matplotlib inline
from matplotlib import rc
rc("font", family="serif", size=16)
rc("text", usetex=True)
import daft
def plot_FA():
    """Draw the factor analysis model: two latent factors, three observations."""
    pgm = daft.PGM([6.3, 4.05], origin=[-1., -0.3], aspect=1.)
    # Observed variables (shaded nodes).
    pgm.add_node(daft.Node("x1", r"$x_1$", 1.5, 1, observed=True))
    pgm.add_node(daft.Node("x2", r"$x_2$", 2.5, 1, observed=True))
    pgm.add_node(daft.Node("x3", r"$x_3$", 3.5, 1, observed=True))
    # Latent factors.
    pgm.add_node(daft.Node("z1", r"$z_1$", 2., 2.2))
    pgm.add_node(daft.Node("z2", r"$z_2$", 3., 2.2))

    # Add in the edges: every factor influences every observation.
    pgm.add_edge("z1", "x1", directed=True)
    pgm.add_edge("z1", "x2", directed=True)
    pgm.add_edge("z1", "x3", directed=True)
    pgm.add_edge("z2", "x1", directed=True)
    pgm.add_edge("z2", "x2", directed=True)
    pgm.add_edge("z2", "x3", directed=True)

    # TODO: add hyperparameters

    # Render the figure so that it is displayed inline.
    pgm.render()
In [13]:
plot_FA()
  • observation: $\vec x$
  • latent factors: $\vec z$
  • factor loading matrix $W$
  • diagonal noise covariance matrix $\Psi$

Generative Model: $$ \vec x = W \vec z + \vec u $$

$$ \vec z \sim \mathcal N(0,\mathbb{1}) $$

$$ \vec u \sim \mathcal N(0,\Psi) $$

Because $\vec x$ is a linear combination of Gaussian variables, it is itself Gaussian, with zero mean and covariance $W W^T + \Psi$.
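
The mean and covariance can be checked in two steps. The derivation assumes that the noise $\vec u$ is independent of the factors $\vec z$, which is the usual reading of the model but is not written out explicitly here:

$$ \mathbb{E}[\vec x] = W\,\mathbb{E}[\vec z] + \mathbb{E}[\vec u] = 0 $$

$$ \mathbb{E}[\vec x \vec x^T] = W\,\mathbb{E}[\vec z \vec z^T]\,W^T + W\,\mathbb{E}[\vec z \vec u^T] + \mathbb{E}[\vec u \vec z^T]\,W^T + \mathbb{E}[\vec u \vec u^T] = W W^T + \Psi $$

The cross terms vanish because $\vec z$ and $\vec u$ are independent with zero mean, $\mathbb{E}[\vec z \vec z^T] = \mathbb{1}$, and $\mathbb{E}[\vec u \vec u^T] = \Psi$.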

The goal of factor analysis is to find the matrices $W$ and $\Psi$ that best explain the covariance structure of all observations $\vec x$.
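
One common way to find such matrices is maximum likelihood via expectation-maximization (EM). The text above does not say which algorithm is intended, so the following is only a minimal sketch under that assumption. The function name `fit_factor_analysis`, the iteration count and the synthetic test data (`W_true`, `psi_true`) are all hypothetical.

```python
import numpy as np

def fit_factor_analysis(x, n_factors, n_iter=1000, seed=0):
    """Estimate W and the diagonal of Psi by EM (a sketch, not a tuned solver)."""
    rng = np.random.default_rng(seed)
    n_samples, n_observed = x.shape
    x = x - x.mean(axis=0)            # the model assumes zero-mean data
    S = x.T @ x / n_samples           # sample covariance
    W = rng.standard_normal((n_observed, n_factors))
    psi = np.diag(S).copy()
    for _ in range(n_iter):
        # E-step: posterior moments of z given x, written in terms of S.
        sigma = W @ W.T + np.diag(psi)              # current model covariance
        beta = np.linalg.solve(sigma, W).T          # W^T sigma^{-1}
        ezz = np.eye(n_factors) - beta @ W + beta @ S @ beta.T
        # M-step: update the loadings, then the diagonal noise variances.
        W = S @ beta.T @ np.linalg.inv(ezz)
        psi = np.diag(S - W @ beta @ S).copy()
    return W, psi

# Synthetic data from a hypothetical model: 6 observed dimensions, 2 factors.
rng = np.random.default_rng(1)
W_true = np.array([[1.0, 0.0], [0.8, 0.2], [0.6, -0.4],
                   [0.0, 1.0], [0.3, 0.7], [-0.5, 0.5]])
psi_true = np.array([0.3, 0.2, 0.4, 0.3, 0.2, 0.5])
n_samples = 50_000
z = rng.standard_normal((n_samples, 2))
u = rng.standard_normal((n_samples, 6)) * np.sqrt(psi_true)
x = z @ W_true.T + u

W_hat, psi_hat = fit_factor_analysis(x, n_factors=2)
model_cov = W_hat @ W_hat.T + np.diag(psi_hat)
true_cov = W_true @ W_true.T + np.diag(psi_true)
print(np.abs(model_cov - true_cov).max())
```

The comparison is made on the covariance $W W^T + \Psi$ rather than on $W$ itself, because rotating the factors changes $W$ without changing $W W^T$.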


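  • $\mathrm{dim}(\vec z) < \mathrm{dim}(\vec x)$: there are fewer factors than observed dimensions
  • the diagonality of $\Psi$ is one of the key assumptions of factor analysis: the noise is uncorrelated across observed dimensions, so every correlation between components of $\vec x$ has to be explained by $W$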

Further Readings