Within bnpy, every hierarchical model we support has two pieces: an allocation model and an observation model. We use the label “allocation model” to describe the generative process that allocates cluster assignments to individual data points.
In this document, we give a high-level overview of how we define an allocation model and how variational inference works. We also define the essential variational inference API functions that any concrete allocation model (an instance of the abstract AllocModel class) should support.
Here are some quick links to documentation for each of the possible allocation models supported by bnpy.
An allocation model defines a probabilistic generative process for assigning (aka allocating) clusters to data atoms. There are two types of variables involved: cluster probability vectors $\pi_j$, and discrete assignments $z_n$ at each data atom indexed by $n$. Each allocation model defines a joint distribution over these variables through the following generative steps.
First, we generate a set of global cluster probabilities $\pi_0$.
Depending on the model, we may next generate several more cluster probability vectors $\pi_j$.
Second, we draw cluster assignment variables $z_n$ at each data atom $n$.
For example, consider a simple finite mixture model with $K$ clusters. The complete allocation model would be:
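As a concrete illustration of this generative process, the sketch below samples $\pi_0$ and the assignments $z_n$ for a finite mixture. It assumes a symmetric Dirichlet prior whose entries all equal `alpha / K`; that parameterization, the function name, and its arguments are illustrative choices and not part of bnpy. Only the Python standard library is used, to keep the sketch self-contained.

```python
import random


def sample_finite_mixture_allocations(n_atoms, n_clusters, alpha, seed=0):
    """Draw (pi_0, z) from a toy finite mixture allocation model.

    Assumption: pi_0 ~ Dirichlet(alpha / K, ..., alpha / K). This choice
    of prior is for illustration only.
    """
    rng = random.Random(seed)
    # A Dirichlet draw is a vector of independent Gamma draws, normalized.
    gammas = [rng.gammavariate(alpha / n_clusters, 1.0)
              for _ in range(n_clusters)]
    total = sum(gammas)
    pi_0 = [g / total for g in gammas]
    # Each assignment z_n is an independent categorical draw from pi_0.
    z = rng.choices(range(n_clusters), weights=pi_0, k=n_atoms)
    return pi_0, z


pi_0, z = sample_finite_mixture_allocations(n_atoms=100, n_clusters=5, alpha=5.0)
```

The returned `pi_0` sums to one, and `z` holds one cluster index per data atom.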
To extend this to a Dirichlet process mixture model, we simply use a stick-breaking distribution instead:
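The stick-breaking construction can be sketched as follows. Each stick fraction $v_k$ is drawn from a Beta(1, $\alpha$) distribution, and cluster $k$ receives the fraction $v_k$ of whatever probability mass is still unassigned. The truncation level and the function names are assumptions made for this illustration; they are not bnpy functions.

```python
import random


def stick_breaking_weights(fractions):
    """Turn stick fractions v_1..v_K into weights plus the leftover mass."""
    weights = []
    remaining = 1.0  # probability mass not yet assigned to any cluster
    for v in fractions:
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights, remaining


def sample_truncated_dp_weights(alpha, truncation, seed=0):
    """Sample the first `truncation` weights of a DP stick-breaking prior."""
    rng = random.Random(seed)
    fractions = [rng.betavariate(1.0, alpha) for _ in range(truncation)]
    return stick_breaking_weights(fractions)


weights, leftover = sample_truncated_dp_weights(alpha=2.0, truncation=10)
```

The weights and the leftover mass always sum to one, so the leftover can be read as the total probability of all clusters beyond the truncation.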
Variational inference for allocation models tries to optimize an approximate posterior:
The optimization objective is to make this approximate posterior as close to the true posterior as possible. Remember that this objective incorporates terms from the observation model as well. The optimization finds values for the free parameters – pseudocounts $\theta$ and assignments $r$ – that make the objective function as large as possible.
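For the finite mixture example, one standard mean-field family that matches the free parameters named above is written out here. We state it as an assumption for illustration: the Dirichlet factor carries the pseudocounts $\theta$, and each categorical factor carries one row of the assignments $r$.

```latex
q(\pi_0, z) = \mathrm{Dir}(\pi_0 \mid \theta_1, \ldots, \theta_K)
              \prod_{n=1}^{N} \mathrm{Cat}(z_n \mid r_{n1}, \ldots, r_{nK})
```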
Expanding the allocation model terms, we have
Every variational algorithm proceeds by iteratively improving this objective function by cycling through four concrete steps: the local step, the summary step, the global step, and the objective step.
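The cycle through the local, summary, global, and objective steps can be sketched as a simple loop. The method names `local_step`, `summary_step`, `global_step`, and `objective_step` are hypothetical stand-ins for the four steps, and the stopping rule is an assumption; this is not bnpy's learning algorithm code.

```python
def run_coordinate_ascent(model, data, n_iters=20, tol=1e-8):
    """Cycle through the four steps and record the objective each lap.

    `model` is any object that provides the four hypothetical methods
    called below.
    """
    history = []
    for _ in range(n_iters):
        local_params = model.local_step(data)             # per-atom assignments
        summary = model.summary_step(data, local_params)  # sufficient statistics
        model.global_step(summary)                        # update global parameters
        history.append(model.objective_step(data, summary, local_params))
        # Stop once the objective has effectively stopped changing.
        if len(history) > 1 and abs(history[-1] - history[-2]) < tol:
            break
    return history
```

Keeping the objective value from every lap makes it easy to inspect convergence afterwards.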
Within bnpy, each possible allocation model is a subclass of the general-purpose abstract base class AllocModel.
Each AllocModel instance has both state and behaviors.
The state represents two key values: the hyperparameters that define the prior and the global variational parameters that define the approximate posterior. The behaviors are the four fundamental steps of inference, as well as some auxiliary functions.
For any generative model in our framework, the hyperparameters of an allocation model are just the set of concentration parameters $\alpha_j$ that parameterize the generative story for each $\pi_j$ probability vector. Thus, each allocation model will hold one or more alpha values as attributes.
Each AllocModel subclass will have model-specific global parameters, which are represented as instance attributes. For example, a FiniteMixtureModel has a vector of Dirichlet pseudocounts called theta, while a DPMixtureModel instance has a vector of Beta pseudocounts called eta.
Each of the four conceptual steps of variational inference – local step, summary step, global step, and objective step – is associated with a single instance-level function of an AllocModel object. The general abstract interface for using these functions is documented below. Each subclass will provide an actual implementation of these functions.
The local step, specified by calc_local_params, finds local parameters for the dataset.
bnpy.allocmodel.AllocModel(inferType)
calc_local_params(Data, LP)
Compute local parameters for each data item and component.
This is the E-step of the EM algorithm.
Returned LP contains optimal values of local parameters specific to the provided dataset. Updated values are computed using the current global parameter attributes.
Possible keyword arguments control model-specific computations.
Parameters: 


Returns: LP (dict) – Contains updated fields for all K clusters in the current model.
* ‘resp’ : N x K 2D array, soft assignments for each data atom.
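To make the ‘resp’ field concrete, here is a minimal sketch of how soft assignments are typically computed in a mean-field local step. Each row is proportional to the exponential of the expected log cluster probability plus the log-likelihood under that cluster, and is then normalized. The function name and its two inputs are assumptions for illustration; this is not the body of calc_local_params.

```python
import math


def calc_resp(log_lik, expected_log_pi):
    """Return N x K soft assignments as a list of rows.

    log_lik[n][k]      : log-likelihood of atom n under cluster k,
                         as supplied by the observation model.
    expected_log_pi[k] : expected log probability of cluster k.
    """
    resp = []
    for row in log_lik:
        scores = [ll + w for ll, w in zip(row, expected_log_pi)]
        top = max(scores)  # subtract the max for numerical stability
        unnorm = [math.exp(s - top) for s in scores]
        total = sum(unnorm)
        resp.append([u / total for u in unnorm])
    return resp


resp = calc_resp(
    log_lik=[[0.0, 0.0], [math.log(3.0), 0.0]],
    expected_log_pi=[math.log(0.5), math.log(0.5)],
)
```

Every row of the result is non-negative and sums to one.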
The summary step, specified by get_global_suff_stats, summarizes a dataset Data and its associated local parameters LP. It produces a bag of sufficient statistics SS.
bnpy.allocmodel.AllocModel(inferType)
get_global_suff_stats(Data, SS, LP, **kwargs)
Compute low-dim summaries for provided local params.
Returned sufficient statistics are deterministic given Data, LP.
Possible keyword arguments control model-specific computations.
Parameters: 


Returns:  SS ( 
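For a mixture model, the key allocation summary is the expected number of data atoms assigned to each cluster, obtained by summing the soft assignments over atoms. The sketch below shows that computation on a plain list of rows. The function name is hypothetical, and the contents of bnpy's actual SS object are not reproduced here.

```python
def calc_cluster_counts(resp):
    """Sum an N x K soft-assignment table over atoms, giving K counts."""
    n_clusters = len(resp[0])
    return [sum(row[k] for row in resp) for k in range(n_clusters)]


counts = calc_cluster_counts([[0.5, 0.5], [0.75, 0.25]])
```

The result depends only on the local parameters, which is why the summary is deterministic once Data and LP are fixed.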
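The global step, performed by update_global_params, updates the global variational parameters (stored as attributes of the model) given the sufficient statistics SS.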
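As an illustration of a global step, consider the finite mixture with Dirichlet pseudocounts theta. Under the assumption of a symmetric Dirichlet prior with entries `alpha / K`, the standard conjugate update adds the expected cluster counts to the prior. The function below is a sketch of that rule under this assumption, not the body of update_global_params.

```python
def update_theta(cluster_counts, alpha):
    """Return Dirichlet pseudocounts: prior alpha / K plus expected counts."""
    n_clusters = len(cluster_counts)
    prior = alpha / n_clusters  # assumed symmetric prior on each cluster
    return [prior + n_k for n_k in cluster_counts]


theta = update_theta(cluster_counts=[1.25, 0.75], alpha=1.0)
```

With no data (all counts zero) the update returns the prior pseudocounts unchanged.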
During inference, we need to verify that each step is working as expected. Thus, we need to be able to compute the scalar value of the objective given any current set of global parameters (stored in self) and local parameters (summarized in SS).
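One practical use of the scalar objective is a monotonicity check: exact coordinate ascent updates should never decrease it, so a drop larger than numerical noise usually signals a bug in one of the steps. The helper below is a hypothetical sketch of such a check, with an assumed tolerance.

```python
def first_objective_decrease(history, tol=1e-9):
    """Return the index of the first lap where the objective dropped.

    Returns None when the recorded values never decrease by more than
    `tol`.
    """
    for i in range(1, len(history)):
        if history[i] < history[i - 1] - tol:
            return i
    return None


assert first_objective_decrease([-10.0, -8.0, -8.0, -7.5]) is None
```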
bnpy.allocmodel.AllocModel(inferType)
calc_evidence(Data, SS, LP, todict=0, **kwargs)
Calculate ELBO objective function value for provided state.
Parameters: 


Keyword Arguments: todict (boolean) – If True, return a dict with different ELBO terms. If False [default], return scalar value equal to sum of terms.

Returns:  L (float) – Represents sum of all terms in optimization objective. Will be a dict if todict option is True. 
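The todict behavior can be illustrated with a small sketch. It computes one term that appears in any ELBO with soft assignments, the entropy of the assignment distribution, and then shows the two return modes: a dict of named terms, or their sum. Both function names and the term labels are hypothetical, and the remaining ELBO terms are left out of the sketch.

```python
import math


def calc_assignment_entropy(resp):
    """Entropy of soft assignments: -sum r * log(r), treating 0 log 0 as 0."""
    return -sum(r * math.log(r) for row in resp for r in row if r > 0.0)


def combine_elbo_terms(terms, todict=False):
    """Return the named terms as a dict, or their scalar sum by default."""
    if todict:
        return dict(terms)
    return sum(terms.values())


terms = {"entropy_z": calc_assignment_entropy([[0.5, 0.5], [1.0, 0.0]])}
total = combine_elbo_terms(terms)
```

Returning the separate terms is mainly useful for debugging, since it shows which part of the objective changed between laps.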