# Observation Models

All observation models define a *likelihood* for producing data \(x_n\) from some cluster-specific density with parameter \(\phi_k\):

\[p(x | \phi, z) = \prod_{n=1}^N p( x_n | \phi_k )^{\delta_k(z_{n})}\]

Supported Bayesian methods require specifying a (conjugate) prior:

\[p(\phi) = \prod_{k=1}^K p(\phi_k)\]

## Variational methods for observation models

The links below describe the mathematical and computational details for performing standard variational optimization for supported observation models: