Negative mutual information after using shuffle (but correct trend) #17

Open
qiongxiu opened this issue Apr 10, 2020 · 4 comments

@qiongxiu

Dear Greg,

I am using npeet to estimate mutual information in a distributed least squares problem, but I often get negative mutual information even when using shuffle_test. Interestingly, even though most of the results are negative, the trend seems right. As shown in the attached figure, the blue line first increases and then converges, while the red line starts far from the blue line and then converges. This trend is what I expected, but I cannot explain the negative values. Do you have any idea about this? Thanks in advance.
[Attached figure: mutual]

@gregversteeg
Owner

I'm not sure I understand the example. What is the x-axis, and the difference between blue and red?

While the true mutual information is never negative, the estimator can be. The reason is that estimating mutual information empirically from finite data has some bias. When we subtract out the bias, we end up with an estimator whose mean is correct but which sometimes gives negative answers. For many applications, it suffices to treat a negative MI estimate as zero.

@gregversteeg
Owner

To add another point: the shuffle test is trying to estimate the bias of the mutual information estimator. It does so by shuffling data (to get a case that should have zero mutual information). The mutual information we get in the shuffled case is an estimate of the bias. This bias is then subtracted from our estimate on real data, sometimes leading to negative MI.
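
The shuffle-and-subtract idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a simple histogram (plug-in) MI estimator written with NumPy, not NPEET's own estimator, and the names `plugin_mi` and `shuffle_corrected_mi`, the bin count, and the number of shuffles are all made up for the example.

```python
import numpy as np


def plugin_mi(x, y, bins=16):
    """Histogram (plug-in) MI estimate in nats. Non-negative, biased upward."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    mask = pxy > 0                        # skip empty cells (0 * log 0 = 0)
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))


def shuffle_corrected_mi(x, y, n_shuffles=50, bins=16, rng=None):
    """Return (corrected, raw, baseline); baseline is the mean MI on shuffled x."""
    rng = np.random.default_rng() if rng is None else rng
    raw = plugin_mi(x, y, bins)
    # Shuffling x destroys any dependence on y, so what remains is estimator bias.
    baseline = float(np.mean(
        [plugin_mi(rng.permutation(x), y, bins) for _ in range(n_shuffles)]
    ))
    return raw - baseline, raw, baseline


rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y_indep = rng.normal(size=n)             # independent of x: true MI is 0
y_dep = x + 0.5 * rng.normal(size=n)     # strongly dependent on x

corr_indep, raw_indep, base_indep = shuffle_corrected_mi(x, y_indep, rng=rng)
corr_dep, raw_dep, base_dep = shuffle_corrected_mi(x, y_dep, rng=rng)

print(f"independent: raw={raw_indep:.4f} baseline={base_indep:.4f} corrected={corr_indep:+.4f}")
print(f"dependent:   raw={raw_dep:.4f} baseline={base_dep:.4f} corrected={corr_dep:+.4f}")
```

For the independent pair, the raw estimate and the shuffled baseline are both positive and close to each other, so their difference scatters around zero and can land on either side of it. Applying `max(corrected, 0.0)` afterwards gives the "treat negative as zero" convention.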

@qiongxiu
Author

The x-axis denotes the iteration, and the blue and red lines denote two mutual information quantities. In theory, the blue line should show more correlation than the red line during the first few iterations, after which both should converge to the same result. The trend in this plot is correct.

It seems npeet has difficulty distinguishing very small mutual information values, such as the difference between 10^{-5} and 10^{-2}. I also think npeet does rank -0.5 as less correlated than -0.1, so if I just set all negative mutual information values to zero, we can no longer distinguish them. I am a bit confused about how to handle this result: simply setting all negative MI values to zero loses the distinguishability.

@qiongxiu
Author

I set N to 10000 for now, and I assume that is sufficient. Do you think I should increase the number of samples, e.g. to 10^5?
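
One rough way to explore the sample-size question is to look at how much the estimate on shuffled data fluctuates at each N, since differences smaller than that spread are hard to resolve. The sketch below is an assumption-laden illustration: it uses a simple histogram MI estimator rather than NPEET's estimator (whose scaling with N may differ), synthetic Gaussian data, and hypothetical names (`plugin_mi`, `null_spread`).

```python
import numpy as np


def plugin_mi(x, y, bins=8):
    """Histogram (plug-in) MI estimate in nats; a stand-in, not NPEET's estimator."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    mask = pxy > 0
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))


def null_spread(n, n_shuffles=100, seed=0):
    """Standard deviation of the MI estimate over shuffled copies of n samples."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = x + rng.normal(size=n)            # synthetic data, for illustration only
    null = [plugin_mi(rng.permutation(x), y) for _ in range(n_shuffles)]
    return float(np.std(null))


# Noise floor of the estimator at several sample sizes.
spreads = {n: null_spread(n) for n in (1_000, 10_000, 100_000)}
for n, s in spreads.items():
    print(f"N={n:>7}: spread of shuffled estimates = {s:.2e}")
```

If the spread at N = 10^4 is larger than the differences of interest, a larger N (or averaging over more shuffles) is one thing to try; the same check can be run with the real data and the real estimator in place of the stand-ins used here.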
