Question on how to compute normalized mutual information for discrete and continuous data #25

Open
ivan-marroquin opened this issue Dec 10, 2021 · 2 comments

Comments

@ivan-marroquin

Hi Greg,

Many thanks for making available such great Python code!

I was wondering if you could provide suggestions on how to compute normalized mutual information for discrete and continuous data. I would expect the normalized version of mutual information to be in the range [0, 1].

Kind regards,
Ivan

@gregversteeg
Owner

That makes sense. If X is continuous and Z is discrete, then I(X;Z) = H(Z) - H(Z|X) <= H(Z), where H is Shannon entropy, which is always non-negative. So using I(X;Z) / H(Z) is probably your best bet for a normalized quantity.
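
Spelled out, the bound above gives the [0, 1] range directly. This is only a restatement of the inequality in this comment, with the added assumption that H(Z) > 0 (Z takes more than one value):

```latex
\begin{align}
I(X;Z) &= H(Z) - H(Z \mid X), \qquad H(Z \mid X) \ge 0 \quad \text{(since $Z$ is discrete)} \\
0 \;\le\; I(X;Z) &\le H(Z) \\
0 \;\le\; \frac{I(X;Z)}{H(Z)} &\le 1 \qquad \text{(assuming } H(Z) > 0\text{)}
\end{align}
```

The ratio is 0 when X and Z are independent and 1 when Z is fully determined by X.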

For estimation, you will have to use the "micd" estimator (mutual information between continuous and discrete). Note that it computes the quantity the other way around, as I(X;Z) = h(X) - h(X|Z), where h is differential entropy estimated using NPEET. My one worry is that because the estimation errors for the two terms could differ, you may sometimes get values outside your desired range. Depending on your scenario, you could deal with it in different ways: clip the values, or use bootstrap sampling to get a range of possible values.
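
As a rough sketch of the two workarounds mentioned above (clipping and bootstrap sampling), something like the following could work. It uses only the standard library. The function names and the `mi_estimator` argument are hypothetical and not part of NPEET: `mi_estimator` stands for any callable that takes resampled `(x, z)` and returns an estimate of I(X;Z), for example a thin wrapper around the "micd" estimator. It also assumes the estimate and H(Z) are in the same units (same log base).

```python
import math
import random
from collections import Counter


def discrete_entropy(labels, base=2):
    """Plug-in (empirical) Shannon entropy H(Z) of a sequence of discrete labels."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log(c / n, base) for c in counts.values())


def normalized_mi(mi_estimate, labels, base=2):
    """Return I(X;Z) / H(Z), clipped to [0, 1].

    mi_estimate is an already-computed estimate of I(X;Z), assumed to be
    in the same units (log base) as the entropy computed here.
    """
    h_z = discrete_entropy(labels, base)
    if h_z == 0:
        return 0.0  # Z is constant, so there is nothing to normalize by
    return min(1.0, max(0.0, mi_estimate / h_z))


def bootstrap_normalized_mi(x, z, mi_estimator, n_boot=200, seed=0, base=2):
    """Rough 95% percentile interval for the normalized MI.

    mi_estimator is a hypothetical callable (xs, zs) -> float supplied by
    the caller. Resampling with replacement duplicates points, which may
    bias nearest-neighbor estimators; subsampling without replacement is
    an alternative.
    """
    rng = random.Random(seed)
    n = len(z)
    values = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample row indices
        xs = [x[i] for i in idx]
        zs = [z[i] for i in idx]
        values.append(normalized_mi(mi_estimator(xs, zs), zs, base))
    values.sort()
    lo = values[int(0.025 * (n_boot - 1))]
    hi = values[int(0.975 * (n_boot - 1))]
    return lo, hi
```

Clipping hides the estimation error rather than fixing it, so the bootstrap interval is probably the more informative of the two when the raw ratio lands near or outside the ends of the range.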

@ivan-marroquin
Author

Hi Greg,

Thanks for the prompt answer and explanation. So, I assume that if I only have continuous data, the normalized mutual information can be computed using I(X;Z) / H(Z), and in this case I use your continuous estimator for mutual information. On the other hand, if I only have discrete data, then I compute the mutual information using the discrete version of the estimator. Am I correct?

Ivan
