Adding (Gaussian) mixture random variables #104

Open
bartvanerp opened this issue Jul 28, 2020 · 0 comments
Labels
enhancement New feature or request

bartvanerp commented Jul 28, 2020

Example signal model

Consider the probabilistic model:

X ~ GMM(z_x, mu_x1, lambda_x1, ..., mu_xN, lambda_xN) 
Y ~ GMM(z_y, mu_y1, lambda_y1, ..., mu_yM, lambda_yM) 
Z = X + Y

where X and Y are distributed as (Gaussian) mixtures with N and M clusters, respectively. Suppose in this case that Z is observed and that all means and precisions in the mixture models are known. The goal is to infer X and Y, including the posterior class probabilities. From a theoretical point of view, the random variable Z can be represented as a Gaussian mixture model with NM clusters, where each cluster corresponds to the sum of one cluster of X and one cluster of Y: its weight is the product of the corresponding prior class probabilities, its mean is the sum of the component means, and its variance is the sum of the component variances. This relationship is derived, for example, in https://stats.stackexchange.com/questions/174791/sum-of-gaussian-mixture-and-gaussian-scale-mixture.
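
As a numerical sanity check of this relationship (not from the original issue; the parameter values below are made up for illustration and Distributions.jl is assumed), the NM-component construction for Z can be compared against samples of X + Y:

# Numerical check (illustrative, with made-up parameters): Z = X + Y is a GMM with
# N*M components whose weights multiply and whose means and variances add.
using Distributions, Statistics

w_x, mu_x, lambda_x = [0.3, 0.7], [-2.0, 1.0], [1.0, 4.0]                 # N = 2
w_y, mu_y, lambda_y = [0.5, 0.2, 0.3], [0.0, 3.0, -1.0], [2.0, 1.0, 0.5]  # M = 3

X = MixtureModel([Normal(m, sqrt(1/l)) for (m, l) in zip(mu_x, lambda_x)], w_x)
Y = MixtureModel([Normal(m, sqrt(1/l)) for (m, l) in zip(mu_y, lambda_y)], w_y)

# NM-component representation of Z
components = [Normal(mx + my, sqrt(1/lx + 1/ly))
              for (mx, lx) in zip(mu_x, lambda_x), (my, ly) in zip(mu_y, lambda_y)]
weights = [wx * wy for wx in w_x, wy in w_y]
Z = MixtureModel(vec(components), vec(weights))

# Moments of samples of X + Y should match the NM-component mixture
samples = rand(X, 100_000) .+ rand(Y, 100_000)
println((mean(samples), mean(Z)))  # sample mean vs. mixture mean
println((var(samples), var(Z)))    # sample variance vs. mixture variance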

ForneyLab implementation

The implementation of the probabilistic model in ForneyLab is rather straightforward using the available Gaussian mixture node.
Using variational message passing, the messages for inferring the latent variables X and Y can be derived. The Gaussian messages currently flowing out of the Gaussian mixture nodes correspond to the weighted sum of the individual mixture components.
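
For concreteness, below is a minimal sketch of such a collapsed Gaussian message, assuming the standard VMP update rule for a Gaussian mixture node with known component means and precisions; the names are illustrative, not ForneyLab internals:

# Sketch, not ForneyLab source: the collapsed variational message towards the output
# edge of a Gaussian mixture node with known component means mu and precisions lambda.
# Its natural parameters are the responsibility-weighted sums over the components.
function collapsed_message(q_z::Vector{Float64}, mu::Vector{Float64}, lambda::Vector{Float64})
    W  = sum(q_z .* lambda)          # outgoing precision
    xi = sum(q_z .* lambda .* mu)    # outgoing precision-weighted mean
    return (m = xi / W, w = W)       # single Gaussian summarizing the mixture
end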

Problem

The variational messages flowing out of the Gaussian mixture node can introduce a significant bias in the posterior class probabilities and, consequently, in the estimates of the latent states X and Y. This bias appears to be determined by the priors of the Gaussian mixture models (class probabilities, means, precisions/variances). Especially in higher-dimensional spaces with non-uniform prior class probabilities, this bias significantly degrades the latent state tracking performance.

Affected signal models

The example above is a simple case where the problem occurs. However, the problem arises in any model where latent random variables, each related to some sort of mixture model, are added together. The addition of 'switching models' is therefore also affected. The issue extends beyond the class of Gaussian messages. Intuitively, all affected models are of the form:

p(z | x_1, ..., x_D) = delta(z - sum(x_i))
p(x_i | {some set of parameters}) = ...
{one of these parameters} ~ mixture model(...)

At least two of the variables should be related to a mixture model in some way.
The problem might even extend to arbitrary conditional distributions in which (at least) two of the conditioning arguments are mixture models, although this claim has not been verified.

Workaround

For the simple example at hand, the problem is alleviated by first computing the posterior class probabilities once with an external function, and then using these posterior class probabilities in ForneyLab to determine the hidden states.

using Distributions  # Normal, logpdf; p_zx and p_zy hold the prior class probabilities
log_p_posterior = zeros(N, M)
for n = 1:N, m = 1:M
    # likelihood of the observed Z under summed component (n, m):
    # mean mu_x[n] + mu_y[m], variance 1/lambda_x[n] + 1/lambda_y[m]
    log_p_posterior[n, m] = log(p_zx[n]) + log(p_zy[m]) +
        logpdf(Normal(mu_x[n] + mu_y[m], sqrt(1/lambda_x[n] + 1/lambda_y[m])), Z)
end
log_p_posterior .-= maximum(log_p_posterior)  # numerical stability
z_posterior = exp.(log_p_posterior) ./ sum(exp.(log_p_posterior))

Marginalizing the normalized joint posterior over its dimensions gives the posterior class probabilities of z_x and z_y, as shown below. Multiple additions would increase the number of nested loops.
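
Continuing the sketch above (same variable names), the individual class posteriors follow by summing out the other dimension:

q_zx = vec(sum(z_posterior, dims=2))  # posterior class probabilities of z_x
q_zy = vec(sum(z_posterior, dims=1))  # posterior class probabilities of z_y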

Required ForneyLab adaptations

To implement these kinds of problems in ForneyLab, adding (Gaussian) mixture messages is the most straightforward route. For the addition node in the example, the mixture message for Z should be expanded automatically based on the incoming messages. Backward messages towards the individual edges (e.g. X), however, would likely require more advanced update mechanisms, such that the Gaussian mixture message of Z is properly decomposed.
One major downside of this approach is its computational complexity. It would require some sort of node broadcasting, such that the individual mixture components are processed separately.
Furthermore, multiple additions would result in inference whose complexity no longer grows linearly: a single addition already costs O(NM), and every additional mixture-valued summand multiplies the number of components again. In Optimal Mixture Approximation of the Product of Mixtures - Schrempf, some methods based on the sparsity of the mixture model are proposed to reduce this complexity. In ALGONQUIN: Iterating Laplace’s Method to Remove Multiple Types of Acoustic Distortion for Robust Speech Recognition - Frey 2001 and Super-human multi-talker speech recognition: A graphical modelling approach - Hershey 2010, the authors claim to have derived (or to be working on) a version that is linear in the number of additions, O(N+M) instead of O(NM). However, no follow-up papers appear to have been published that verify this claim.
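
To make the proposed expansion concrete, here is a hypothetical sketch of what a mixture-valued forward message and its addition-node update could look like; GaussianMixtureMessage and forward_sum are made-up names, not ForneyLab API:

# Hypothetical sketch (not ForneyLab API): a mixture-valued message and the forward
# expansion an addition node would perform, illustrating the O(NM) blow-up.
struct GaussianMixtureMessage
    weights::Vector{Float64}
    means::Vector{Float64}
    variances::Vector{Float64}
end

# Forward message towards Z for Z = X + Y: weights multiply, means and variances add,
# giving N*M components and hence the multiplicative growth discussed above.
function forward_sum(mx::GaussianMixtureMessage, my::GaussianMixtureMessage)
    w = vec([a * b for a in mx.weights,   b in my.weights])
    m = vec([a + b for a in mx.means,     b in my.means])
    v = vec([a + b for a in mx.variances, b in my.variances])
    return GaussianMixtureMessage(w ./ sum(w), m, v)
end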

@bartvanerp bartvanerp added the enhancement New feature or request label Jul 28, 2020