Call to array-like unique() in ideal_kernel results in a type error #25

Open
greenstick opened this issue Dec 10, 2022 · 0 comments

greenstick commented Dec 10, 2022

First off, thank you for writing this library. It's great, I've found it helpful, and I'll be citing it in my research. That said, on to the issue: in the function MKLpy.utils.misc.ideal_kernel, the call to Y.unique() is not general. Specifically, I'm working with MKLpy.metrics.alignment.alignment_yy, which calls ideal_kernel as a subroutine, and I'm passing an np.ndarray as the y parameter. An np.ndarray has no unique method, while a torch.Tensor does, which I suspect is why unique was written as a method call in ideal_kernel. This points to an assumption that the user will pass an array-like object to ideal_kernel, while the code is not sufficiently general to support an np.ndarray.
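A minimal reproduction (the kernel matrix and labels are toy placeholders; the import path follows the module names above, and I'm assuming alignment_yy takes a precomputed kernel plus a label vector, as in my usage):

```python
import numpy as np
from MKLpy.metrics.alignment import alignment_yy

# Toy inputs: a precomputed linear kernel and integer labels as an np.ndarray
X = np.random.randn(20, 5)
K = X @ X.T
y = np.array([0] * 10 + [1] * 10)

# Raises AttributeError inside ideal_kernel: np.ndarray has no .unique() method
alignment_yy(K, y)
```

I see two possible solutions here: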

  1. Stick to PyTorch: instead of calling array_like.unique(), apply the function torch.unique() to the array-like object, which removes the assumption that unique is a method of the object. Of course, type consistency would then be an issue, since this casts the input to a torch.Tensor. Personally, I expect the datatype returned to be consistent with the datatype passed in, and in this instance I see no reason not to receive an np.ndarray back.
  2. Alternatively, duck type within the function: dispatch the computation of the unique array on the concrete type passed in, and return the same type that was input. So a torch.Tensor in means a torch.Tensor out, an np.ndarray in means an np.ndarray out, and so on (a sketch of this dispatch follows the list).
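For concreteness, here is a minimal sketch of option 2's dispatch. The helper name unique_labels is mine, not the library's; it would only replace the Y.unique() call, leaving the rest of ideal_kernel unchanged:

```python
import numpy as np
import torch

def unique_labels(Y):
    """Type-preserving stand-in for the Y.unique() call inside ideal_kernel.

    torch.Tensor in -> torch.Tensor out (via torch.unique, as in option 1);
    any other array-like in -> np.ndarray out (via np.unique).
    """
    if isinstance(Y, torch.Tensor):
        return torch.unique(Y)
    return np.unique(np.asarray(Y))  # handles np.ndarray, lists, tuples, ...
```

The same dispatch pattern could wrap whatever downstream tensor operations ideal_kernel performs, so the return type always matches the input type.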

As a final note, it wasn't until I came across this issue that I became aware PyTorch was being used in some places. PyTorch is wonderful, but I wonder if it's necessary here. My understanding is that letting NumPy do the heavy numeric lifting wouldn't present a compatibility issue for PyTorch users, while the reverse is less true. Also, maybe I haven't explored MKLpy enough, but it seems to me that the automatic differentiation capabilities of PyTorch are only lightly used (of course, I may be wrong about this). If that's the case, it may be worth using a different, lighter-weight library with a more NumPy-like API, like JAX. An example of a similar use of JAX is this kernel library, which leverages Scikit-learn, much as MKLpy does.
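To illustrate what I mean by NumPy-like (a trivial standalone example, not MKLpy code), the unique call at the heart of this issue reads essentially the same under both libraries:

```python
import numpy as np
import jax.numpy as jnp

y = [0, 1, 1, 2]
print(np.unique(np.asarray(y)))    # [0 1 2]
print(jnp.unique(jnp.asarray(y)))  # [0 1 2], returned as a JAX Array
```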

Happy to discuss further what a solution might look like and contribute to any updates that may be required.
