The Expectation-Maximization (EM) algorithm estimates the parameters of a multivariate probability density function in the form of a Gaussian mixture distribution with a specified number of mixtures.
Consider the set of the N feature vectors \(\{x_1, x_2, \dots, x_N\}\) from a d-dimensional Euclidean space drawn from a Gaussian mixture:

\[ p(x; a_k, S_k, \pi_k) = \sum_{k=1}^{m} \pi_k p_k(x), \quad \pi_k \ge 0, \quad \sum_{k=1}^{m} \pi_k = 1, \]

\[ p_k(x) = \varphi(x; a_k, S_k) = \frac{1}{(2\pi)^{d/2} |S_k|^{1/2}} \exp\left\{ -\frac{1}{2} (x - a_k)^T S_k^{-1} (x - a_k) \right\}, \]

where \(m\) is the number of mixtures, \(p_k\) is the normal distribution density with the mean \(a_k\) and covariance matrix \(S_k\), and \(\pi_k\) is the weight of the k-th mixture. Given the number of mixtures \(m\) and the samples \(x_i,\ i = 1..N\), the algorithm finds the maximum-likelihood estimates (MLE) of all the mixture parameters, that is, \(a_k\), \(S_k\), and \(\pi_k\):

\[ L(x, \theta) = \log p(x, \theta) = \sum_{i=1}^{N} \log\left( \sum_{k=1}^{m} \pi_k p_k(x_i) \right) \to \max_{\theta \in \Theta}, \]

\[ \Theta = \left\{ (a_k, S_k, \pi_k) : a_k \in \mathbb{R}^d,\ S_k = S_k^T > 0,\ S_k \in \mathbb{R}^{d \times d},\ \pi_k \ge 0,\ \sum_{k=1}^{m} \pi_k = 1 \right\}. \]
The EM algorithm is an iterative procedure. Each iteration includes two steps. At the first step (Expectation step or E-step), you find a probability \(p_{i,k}\) (denoted \(\alpha_{i,k}\) in the formula below) of sample i to belong to mixture k using the currently available mixture parameter estimates:

\[ \alpha_{ki} = \frac{\pi_k \varphi(x; a_k, S_k)}{\sum_{j=1}^{m} \pi_j \varphi(x; a_j, S_j)}. \]

At the second step (Maximization step or M-step), the mixture parameter estimates are refined using the computed probabilities:

\[ \pi_k = \frac{1}{N} \sum_{i=1}^{N} \alpha_{ki}, \quad a_k = \frac{\sum_{i=1}^{N} \alpha_{ki} x_i}{\sum_{i=1}^{N} \alpha_{ki}}, \quad S_k = \frac{\sum_{i=1}^{N} \alpha_{ki} (x_i - a_k)(x_i - a_k)^T}{\sum_{i=1}^{N} \alpha_{ki}}. \]
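To make the two steps concrete, here is a minimal, self-contained C++ sketch of one EM iteration for a mixture with diagonal covariance matrices. It follows the formulas above; it is illustrative only (the structure and function names, such as Mixture and emIteration, are invented for the example) and is not the OpenCV implementation:

#include <cmath>
#include <vector>

struct Mixture
{
    std::vector<std::vector<double> > means;  // m x d, row k is a_k
    std::vector<std::vector<double> > vars;   // m x d, diagonal of S_k
    std::vector<double> weights;              // the pi_k, summing to 1
};

// Diagonal-covariance normal density phi(x; a_k, S_k).
static double gaussDensity( const std::vector<double>& x,
                            const std::vector<double>& mean,
                            const std::vector<double>& var )
{
    const double two_pi = 6.283185307179586;
    double log_p = 0;
    for( size_t j = 0; j < x.size(); j++ )
    {
        double diff = x[j] - mean[j];
        log_p -= 0.5*(std::log(two_pi*var[j]) + diff*diff/var[j]);
    }
    return std::exp(log_p);
}

void emIteration( const std::vector<std::vector<double> >& samples, Mixture& mix )
{
    const size_t N = samples.size(), m = mix.weights.size(), d = samples[0].size();
    std::vector<std::vector<double> > alpha(N, std::vector<double>(m));

    // E-step: alpha_ki = pi_k*phi(x_i; a_k, S_k) / sum_j pi_j*phi(x_i; a_j, S_j)
    for( size_t i = 0; i < N; i++ )
    {
        double denom = 0;
        for( size_t k = 0; k < m; k++ )
        {
            alpha[i][k] = mix.weights[k]*gaussDensity(samples[i], mix.means[k], mix.vars[k]);
            denom += alpha[i][k];
        }
        for( size_t k = 0; k < m; k++ )
            alpha[i][k] /= denom;
    }

    // M-step: refine pi_k, a_k, and the diagonal of S_k from the computed alphas.
    for( size_t k = 0; k < m; k++ )
    {
        double sum_alpha = 0;
        std::vector<double> mean(d, 0), var(d, 0);
        for( size_t i = 0; i < N; i++ )
        {
            sum_alpha += alpha[i][k];
            for( size_t j = 0; j < d; j++ )
                mean[j] += alpha[i][k]*samples[i][j];
        }
        for( size_t j = 0; j < d; j++ )
            mean[j] /= sum_alpha;
        for( size_t i = 0; i < N; i++ )
            for( size_t j = 0; j < d; j++ )
            {
                double diff = samples[i][j] - mean[j];
                var[j] += alpha[i][k]*diff*diff;
            }
        for( size_t j = 0; j < d; j++ )
            var[j] /= sum_alpha;
        mix.weights[k] = sum_alpha/N;
        mix.means[k] = mean;
        mix.vars[k] = var;
    }
}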
Alternatively, the algorithm may start with the M-step when the initial values for \(p_{i,k}\) can be provided. Another alternative, when \(p_{i,k}\) are unknown, is to use a simpler clustering algorithm to pre-cluster the input samples and thus obtain initial \(p_{i,k}\). Often (including in machine learning) the kmeans() algorithm is used for that purpose.
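One possible wiring of this idea is sketched below: pre-cluster with kmeans() and use the cluster centers as the initial means for an E-step start. The function name trainWithKMeansInit is invented for the example, and the element-type requirements for the initial means are an assumption, so treat this as a sketch rather than a recipe:

#include <opencv2/core/core.hpp>
#include <opencv2/ml/ml.hpp>

void trainWithKMeansInit( const cv::Mat& samples /* CV_32FC1, one sample per row */,
                          int nclusters )
{
    cv::Mat labels, centers;
    cv::kmeans( samples, nclusters, labels,
                cv::TermCriteria( CV_TERMCRIT_ITER+CV_TERMCRIT_EPS, 10, 1.0 ),
                3 /* attempts */, cv::KMEANS_PP_CENTERS, centers );

    CvMat means = centers, samples_hdr = samples;  // headers over the same data
    CvEMParams params;
    params.nclusters = nclusters;
    params.start_step = CvEM::START_E_STEP;  // start from the E-step ...
    params.means = &means;                   // ... with the pre-clustered centers

    CvEM em_model;
    em_model.train( &samples_hdr, 0, params, 0 );
}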
One of the main problems of the EM algorithm is the large number of parameters to estimate. The majority of the parameters reside in covariance matrices, which are \(d \times d\) elements each, where \(d\) is the feature space dimensionality. However, in many practical problems the covariance matrices are close to diagonal, or even to \(\mu_k I\), where \(I\) is an identity matrix and \(\mu_k\) is a mixture-dependent "scale" parameter. So, a robust computation scheme could start with harder constraints on the covariance matrices and then use the estimated parameters as an input for a less constrained optimization problem (often a diagonal covariance matrix is already a good enough approximation).
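The sketch below, modeled on the em.cpp sample, illustrates this scheme: a first run with spherical covariance matrices, whose estimates then seed a less constrained run with diagonal ones. The function name trainTwoStage is invented for the example:

#include <opencv2/ml/ml.hpp>

void trainTwoStage( const CvMat* samples, int nclusters )
{
    CvEM em_model;
    CvEMParams params;
    params.nclusters = nclusters;
    params.cov_mat_type = CvEM::COV_MAT_SPHERICAL;  // hardest constraint first
    params.start_step = CvEM::START_AUTO_STEP;
    params.term_crit = cvTermCriteria( CV_TERMCRIT_ITER|CV_TERMCRIT_EPS, 10, 0.1 );
    em_model.train( samples, 0, params, 0 );

    // Second stage: re-optimize with diagonal covariances, starting from the
    // E-step with the spherical estimates as input.
    CvEM em_model2;
    params.cov_mat_type = CvEM::COV_MAT_DIAGONAL;
    params.start_step = CvEM::START_E_STEP;
    params.means = em_model.get_means();
    params.covs = (const CvMat**)em_model.get_covs();
    params.weights = em_model.get_weights();
    em_model2.train( samples, 0, params, 0 );
}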
Parameters of the EM algorithm are collected in the CvEMParams structure. All parameters are public. You can initialize them with a constructor and then override some of them directly, if needed.
The constructors
The default constructor represents a rough rule of thumb:
CvEMParams() : nclusters(10), cov_mat_type(1/*CvEM::COV_MAT_DIAGONAL*/),
    start_step(0/*CvEM::START_AUTO_STEP*/), probs(0), weights(0), means(0), covs(0)
{
    term_crit=cvTermCriteria( CV_TERMCRIT_ITER+CV_TERMCRIT_EPS, 100, FLT_EPSILON );
}
With another constructor, it is possible to override a variety of parameters, from a single number of mixtures (the only essential problem-dependent parameter) to the initial values for the mixture parameters.
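For instance, a sketch of setting the most common fields directly, which is roughly equivalent to passing them to that constructor (the values here are arbitrary):

CvEMParams params;
params.nclusters = 4;                           // the only essential parameter
params.cov_mat_type = CvEM::COV_MAT_SPHERICAL;  // constrain the covariances harder
params.start_step = CvEM::START_AUTO_STEP;
params.term_crit = cvTermCriteria( CV_TERMCRIT_ITER+CV_TERMCRIT_EPS, 100, FLT_EPSILON );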
The class implements the EM algorithm as described at the beginning of this section.
Estimates the Gaussian mixture parameters from a sample set.
Unlike many of the ML models, EM is an unsupervised learning algorithm and it does not take responses (class labels or function values) as input. Instead, it computes the Maximum Likelihood Estimate of the Gaussian mixture parameters from an input sample set and stores all the parameters inside the structure: \(p_{i,k}\) in probs, \(a_k\) in means, \(S_k\) in covs[k], \(\pi_k\) in weights, and optionally computes the output "class label" for each sample: \(\text{labels}_i = \arg\max_k(p_{i,k}),\ i = 1..N\) (indices of the most probable mixture component for each sample).
The trained model can then be used for prediction, just like any other classifier; in this respect it is similar to CvNormalBayesClassifier. For an example of clustering random samples of the multi-Gaussian distribution using EM, see the em.cpp sample in the OpenCV distribution.
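A condensed sketch in the spirit of that sample, assuming 2-D float samples stored one per row (the function name trainAndPredict is invented for the example):

#include <opencv2/ml/ml.hpp>

void trainAndPredict( const CvMat* samples /* CV_32FC1, 2-D points, one per row */ )
{
    CvEM em_model;
    CvEMParams params;
    params.nclusters = 2;
    em_model.train( samples, 0, params, 0 );

    float vec[] = { 0.f, 0.f };                 // a query point
    CvMat sample = cvMat( 1, 2, CV_32FC1, vec );
    float response = em_model.predict( &sample, 0 );  // most probable mixture index
}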
Returns a mixture component index of a sample.
Returns the number of mixture components in the Gaussian mixture model.
Returns mixture means \(a_k\).
Returns mixture covariance matrices \(S_k\).
Returns mixture weights \(\pi_k\).
Returns vectors of probabilities for each training sample.
For each training sample i (that has been passed to the constructor or to CvEM::train()), returns the probabilities \(p_{i,k}\) of belonging to a mixture component k.
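Put together, a sketch of reading the trained parameters back through these accessors (em_model is assumed to be a trained CvEM instance):

int m = em_model.get_nclusters();               // number of mixtures
const CvMat* means = em_model.get_means();      // row k holds a_k
const CvMat** covs = em_model.get_covs();       // covs[k] is S_k
const CvMat* weights = em_model.get_weights();  // the pi_k
const CvMat* probs = em_model.get_probs();      // N x m matrix of the p_{i,k}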
Returns the logarithm of the likelihood.
Returns the difference between the logarithm of the likelihood on the last iteration and the logarithm of the likelihood on the previous iteration.
Writes used parameters of the EM algorithm to a file storage.
Reads parameters of the EM algorithm.
The function reads EM parameters from the specified file storage node.
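Since CvEM derives from CvStatModel, the generic save()/load() methods can be used to persist a trained model; it is assumed here, as for other ML models in the library, that they invoke the write/read methods described above. A sketch (em_model is assumed to be a trained CvEM instance):

em_model.save( "em_model.xml" );     // serializes the used parameters

CvEM em_restored;
em_restored.load( "em_model.xml" );  // reads them back from the file storage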