ValueError: not enough values to unpack (expected 2, got 1) in calculating the micd #24

Open
Angela446-lgtm opened this issue Sep 28, 2021 · 9 comments


@Angela446-lgtm

I get "ValueError: not enough values to unpack (expected 2, got 1)" when I try to calculate the mutual information between a continuous variable and a discrete variable.
Can anybody help me?

import npeet.entropy_estimators as ee
ee.micd(cont.iloc[:, 1].values.tolist(), disc.iloc[:, [1]].values.tolist())

@gregversteeg
Owner

It could be a question of which quantities it expects to be lists of vectors, and which not. I'd try modifying where the brackets are, e.g. this:
ee.micd(cont.iloc[:, [1]].values.tolist(), disc.iloc[:, 1].values.tolist())
That way, the continuous one has shape (n_samples, 1) and the discrete one is just (n_samples,). I think that's right. If it doesn't work, print out the dimensions of cont and disc, and I'll think about it a little more.

@gregversteeg
Owner

(I'm pretty sure discrete expects just a single discrete quantity, not a vector, but continuous does expect a vector.)
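
The reported message is what Python produces when a 1-D shape is unpacked into two names. The sketch below uses NumPy only, with a made-up `data` array standing in for the DataFrame values. Whether npeet unpacks `shape` in exactly this way is an assumption and is not confirmed in this thread.

```python
import numpy as np

# Made-up data standing in for a DataFrame's values: 4 samples, 2 columns.
data = np.array([[0.1, 1.0], [0.2, 0.0], [0.3, 1.0], [0.4, 0.0]])

column_2d = data[:, [1]]  # list index keeps the column axis -> shape (4, 1)
column_1d = data[:, 1]    # scalar index drops it            -> shape (4,)
print(column_2d.shape, column_1d.shape)

# Unpacking a 1-D shape into two names reproduces the reported message.
try:
    n_samples, n_features = column_1d.shape
except ValueError as err:
    message = str(err)
    print(message)
```

pandas follows the same rule: `.iloc[:, [1]].values` is 2-D and `.iloc[:, 1].values` is 1-D.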

@Angela446-lgtm
Author

Yes! It works! Thank you!

@fingoldo

fingoldo commented Aug 8, 2023

> (I'm pretty sure discrete expects just a single discrete quantity, not a vector, but continuous does expect a vector.)

Can you please provide a working example for micd?
I have a working one for mi:

x = [[1.3], [3.7], [5.1], [2.4], [3.4]]
y = [[1.5], [3.32], [5.3], [2.3], [3.3]]
ee.mi(x, y)

0.16831442143704642

Now, according to your recommendations:

x = [[1.3], [3.7], [5.1], [2.4], [3.4]]
y = [5, 3, 5, 2, 3]
ee.micd(x, y)

C:\ProgramData\Anaconda3\lib\site-packages\npeet\entropy_estimators.py in micd(x, y, k, base, warning)
223 entropy_x_given_y = 0.0
224 for yval, py in zip(y_unique, y_proba):
--> 225 x_given_y = x[(y == yval).all(axis=1)]
226 if k <= len(x_given_y) - 1:
227 entropy_x_given_y += py * entropy(x_given_y, k, base)

C:\ProgramData\Anaconda3\lib\site-packages\numpy\core\_methods.py in _all(a, axis, dtype, out, keepdims, where)
62 # Parsing keyword arguments is currently fairly slow, so avoid it for now
63 if where is True:
---> 64 return umr_all(a, axis, dtype, out, keepdims)
65 return umr_all(a, axis, dtype, out, keepdims, where=where)
66

AxisError: axis 1 is out of bounds for array of dimension 1

@gregversteeg
Owner

Oh, how annoying! The way you called it seems more natural, but at a glance it looks like it also expects "vectors" for the discrete values, e.g. y = np.array([[5], [3]...]).
Also it seems like it would only work if y is a numpy array. I can't believe I didn't just put in a check, as y = np.asarray(y) would be an efficient way to avoid problems like this.
Let me know if this works.
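
The AxisError in the traceback comes from the expression `(y == yval).all(axis=1)`, which needs `y` to have a second axis. The sketch below replays that expression outside npeet. The `reshape(-1, 1)` step is a caller-side workaround suggested here, not something npeet is known to do internally.

```python
import numpy as np

y_flat = np.array([5, 3, 5, 2, 3])            # shape (5,)
y_column = np.asarray(y_flat).reshape(-1, 1)  # shape (5, 1)

# On a 1-D array there is no axis 1, so the reduction fails.
try:
    (y_flat == 5).all(axis=1)
    flat_failed = False
except ValueError:  # NumPy's AxisError subclasses ValueError and IndexError
    flat_failed = True

# On a column of 1-element vectors the same expression yields a row mask.
mask = (y_column == 5).all(axis=1)
print(flat_failed, mask)
```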

@fingoldo

fingoldo commented Aug 8, 2023

> Also it seems like it would only work if y is a numpy array. I can't believe I didn't just put in a check, as y = np.asarray(y) would be an efficient way to avoid problems like this.

Hmm, then I get:

x = [[1.3], [3.7], [5.1], [2.4], [3.4]]
y = np.array([[5], [3], [5], [2], [3]])
ee.micd(x, y)

TypeError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_19412\3742637936.py in <module>
1 x = [[1.3], [3.7], [5.1], [2.4], [3.4]]
2 y =np.array([[5], [3], [5], [2], [3]])
----> 3 ee.micd(x, y)

C:\ProgramData\Anaconda3\lib\site-packages\npeet\entropy_estimators.py in micd(x, y, k, base, warning)
223 entropy_x_given_y = 0.0
224 for yval, py in zip(y_unique, y_proba):
--> 225 x_given_y = x[(y == yval).all(axis=1)]
226 if k <= len(x_given_y) - 1:
227 entropy_x_given_y += py * entropy(x_given_y, k, base)

TypeError: only integer scalar arrays can be converted to a scalar index

@gregversteeg
Owner

Can you make x an np.array([]) too?
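
The TypeError is raised because the traceback line `x[(y == yval).all(axis=1)]` indexes `x` with a boolean array, which a plain Python list does not support. A minimal sketch of that behaviour, independent of npeet (the `mask` values are chosen by hand for illustration):

```python
import numpy as np

x_list = [[1.3], [3.7], [5.1], [2.4], [3.4]]
mask = np.array([True, False, True, False, False])  # hand-written row mask

# A Python list cannot be indexed with a boolean array.
try:
    x_list[mask]
    list_error = None
except TypeError as err:
    list_error = str(err)

# After conversion to an array, the same mask selects the matching rows.
x_array = np.asarray(x_list)
selected = x_array[mask]
print(list_error)
print(selected)
```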

@fingoldo

fingoldo commented Aug 8, 2023

Oh wait, it's actually x that has to be a numpy array. Now it works:

x = np.array([[1.3], [3.7], [5.1], [2.4], [3.4]])
y = np.array([[5], [3], [5], [2], [3]])
ee.micd(x, y)

0.0
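
Putting the thread's findings together, one way to avoid both errors is to coerce the inputs before calling micd. The helper `as_2d` below is hypothetical (it is not part of npeet), and the npeet call is left as a comment so the sketch runs with NumPy alone.

```python
import numpy as np

def as_2d(values):
    """Return values as a NumPy array of shape (n_samples, n_features)."""
    arr = np.asarray(values)
    if arr.ndim == 1:
        arr = arr.reshape(-1, 1)  # flat list -> one feature per sample
    if arr.ndim != 2:
        raise ValueError(f"expected 1-D or 2-D input, got {arr.ndim}-D")
    return arr

x = as_2d([[1.3], [3.7], [5.1], [2.4], [3.4]])
y = as_2d([5, 3, 5, 2, 3])
print(x.shape, y.shape)

# With npeet installed, the call from this thread would then be:
# import npeet.entropy_estimators as ee
# ee.micd(x, y)

# Samples per discrete value; the traceback's check `k <= len(x_given_y) - 1`
# compares k against these counts minus one.
values, counts = np.unique(y, return_counts=True, axis=0)
print(dict(zip(values.ravel().tolist(), counts.tolist())))
```

With at most two samples per discrete value, the check `k <= len(x_given_y) - 1` shown in the traceback fails for every class whenever k is 2 or more, so the 0.0 result probably reflects the tiny sample rather than true independence.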

One more question, if possible:
Does the fact that we pass lists of lists imply that npeet functions accept multi-dimensional samples for x and y?
For example, can we estimate the MI between a 3-dimensional x and a 2-dimensional y?

@gregversteeg
Owner

Yes, I think so!
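
The answer above is tentative, so the sketch below only illustrates the input layout such a call would use: one row per sample and one column per dimension. The data are random and the npeet calls are left as comments, since their behaviour on multi-dimensional input is not verified in this thread.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed only for reproducibility
n_samples = 200

x = rng.normal(size=(n_samples, 3))  # 3-dimensional continuous variable
y = rng.normal(size=(n_samples, 2))  # 2-dimensional continuous variable
assert x.shape[0] == y.shape[0]      # both must describe the same samples
print(x.shape, y.shape)

# A 2-dimensional discrete variable: each row is one joint label.
y_disc = rng.integers(0, 2, size=(n_samples, 2))
labels, counts = np.unique(y_disc, axis=0, return_counts=True)
print(labels.shape, counts.sum())

# With npeet installed (not executed here):
# import npeet.entropy_estimators as ee
# ee.mi(x, y)
# ee.micd(x, y_disc)
```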
