The Expectation Maximization(EM) algorithm estimates the parameters of the multivariate probability density function in the form of a Gaussian mixture distribution with a specified number of mixtures.
Consider the set of the N feature vectors
{  } from a d-dimensional Euclidean space drawn from a Gaussian mixture:
 } from a d-dimensional Euclidean space drawn from a Gaussian mixture:


where
 is the number of mixtures,
 is the number of mixtures,
 is the normal distribution
density with the mean
 is the normal distribution
density with the mean
 and covariance matrix
 and covariance matrix
 ,
,
 is the weight of the k-th mixture. Given the number of mixtures
 is the weight of the k-th mixture. Given the number of mixtures
 and the samples
 and the samples
 ,
,
 the algorithm finds the
maximum-likelihood estimates (MLE) of all the mixture parameters,
that is,
 the algorithm finds the
maximum-likelihood estimates (MLE) of all the mixture parameters,
that is,
 ,
,
 and
 and
 :
 :


The EM algorithm is an iterative procedure. Each iteration includes
two steps. At the first step (Expectation step or E-step), you find a
probability
 (denoted
 (denoted
 in the formula below) of
sample i to belong to mixture k using the currently
available mixture parameter estimates:
 in the formula below) of
sample i to belong to mixture k using the currently
available mixture parameter estimates:

At the second step (Maximization step or M-step), the mixture parameter estimates are refined using the computed probabilities:

Alternatively, the algorithm may start with the M-step when the initial values for
 can be provided. Another alternative when
 can be provided. Another alternative when
 are unknown is to use a simpler clustering algorithm to pre-cluster the input samples and thus obtain initial
 are unknown is to use a simpler clustering algorithm to pre-cluster the input samples and thus obtain initial
 . Often (including machine learning) the
k-means algorithm is used for that purpose.
 . Often (including machine learning) the
k-means algorithm is used for that purpose.
One of the main problems of the EM algorithm is a large number
of parameters to estimate. The majority of the parameters reside in
covariance matrices, which are
 elements each
where
 elements each
where
 is the feature space dimensionality. However, in
many practical problems, the covariance matrices are close to diagonal
or even to
 is the feature space dimensionality. However, in
many practical problems, the covariance matrices are close to diagonal
or even to
 , where
 , where
 is an identity matrix and
 is an identity matrix and
 is a mixture-dependent “scale” parameter. So, a robust computation
scheme could start with harder constraints on the covariance
matrices and then use the estimated parameters as an input for a less
constrained optimization problem (often a diagonal covariance matrix is
already a good enough approximation).
 is a mixture-dependent “scale” parameter. So, a robust computation
scheme could start with harder constraints on the covariance
matrices and then use the estimated parameters as an input for a less
constrained optimization problem (often a diagonal covariance matrix is
already a good enough approximation).
References:
The class implements the EM algorithm as described in the beginning of this section.
The constructor
| Parameters: | 
 | 
|---|
Creates empty EM model
| Parameters: | 
 | 
|---|
The model should be trained then using StatModel::train(traindata, flags) method. Alternatively, you can use one of the EM::train* methods or load it from file using StatModel::load<EM>(filename).
Static methods that estimate the Gaussian mixture parameters from a samples set
| Parameters: | 
 | 
|---|
Three versions of training method differ in the initialization of Gaussian mixture model parameters and start step:
 of mixture components. Optionally you can pass initial weights
 of mixture components. Optionally you can pass initial weights  and covariance matrices
 and covariance matrices  of mixture components.
 of mixture components. to use this option.
 to use this option.The methods return true if the Gaussian mixture model was trained successfully, otherwise it returns false.
Unlike many of the ML models, EM is an unsupervised learning algorithm and it does not take responses (class labels or function values) as input. Instead, it computes the
Maximum Likelihood Estimate of the Gaussian mixture parameters from an input sample set, stores all the parameters inside the structure:
 in probs,
 in probs,
 in means ,
 in means ,
 in covs[k],
 in covs[k],
 in weights , and optionally computes the output “class label” for each sample:
 in weights , and optionally computes the output “class label” for each sample:
 (indices of the most probable mixture component for each sample).
 (indices of the most probable mixture component for each sample).
The trained model can be used further for prediction, just like any other classifier. The trained model is similar to the NormalBayesClassifier.
Returns a likelihood logarithm value and an index of the most probable mixture component for the given sample.
| Parameters: | 
 | 
|---|
The method returns a two-element double vector. Zero element is a likelihood logarithm value for the sample. First element is an index of the most probable mixture component for the given sample.
Returns the cluster centers (means of the Gaussian mixture)
Returns matrix with the number of rows equal to the number of mixtures and number of columns equal to the space dimensionality.