Scanpy not working correctly with scikit-learn 0.21.1 #654

Closed
danielStrobl opened this issue May 21, 2019 · 7 comments

Comments

@danielStrobl
Contributor

danielStrobl commented May 21, 2019

Hey!

Scanpy does not seem to work correctly together with scikit-learn 0.21.1.
When running the PBMC clustering tutorial (https://github.com/theislab/scanpy-tutorials/blob/master/pbmc3k.ipynb), the produced UMAP plots look very different to the reference.
[Image: wrong_umap]

By downgrading scikit-learn to 0.20.0, everything works fine.
The problem seems to arise already at the computation of the neighborhood graph, as the clustering is also different.

@LuckyMD
Contributor

LuckyMD commented May 21, 2019

Just to add to this, PCA plots look fine with the newer scikit-learn I believe. Maybe it's the umap neighbourhood graph function depending on sklearn for something?

@gokceneraslan
Collaborator

gokceneraslan commented May 21, 2019

Oh that's reaaaally bad. I did a quick git bisect on sklearn:

[Image: screenshot of the git bisect result]

Here is the commit that broke our umaps: scikit-learn/scikit-learn#13554

@LuckyMD
Contributor

LuckyMD commented May 21, 2019

Should this be relayed to scikit-learn then? If so, that should probably be done by someone who knows where in the sc.pp.neighbors() function this is breaking...

@LuckyMD
Contributor

LuckyMD commented May 21, 2019

@flying-sheep mentioned this was known and already fixed though?

@gokceneraslan
Collaborator

OK, seems to be fixed in sklearn master branch (probably scikit-learn/scikit-learn#13910), but this is such a huge bug and it has been going on since May 9th :( We could have blacklisted sklearn versions 0.21.0 and 0.21.1 if it was known, no? Some colleagues mentioned weird UMAP results with scanpy actually, it turns out they upgraded their sklearn...

gokceneraslan pushed a commit to broadinstitute/regevlab-jupyter-docker that referenced this issue May 21, 2019
@flying-sheep
Member

flying-sheep commented May 22, 2019

> @flying-sheep mentioned this was known and already fixed though?

I meant the other breakage due to the scipy update, sorry.

> We could have blacklisted sklearn versions 0.21.0 and 0.21.1 if it was known, no?

We should do that now. We can do sklearn >= 0.19.1, != 0.21.0, != 0.21.1 I think.
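
In a requirements file or `install_requires`, the constraint above would be spelled `scikit-learn>=0.19.1,!=0.21.0,!=0.21.1`. For anyone who wants to check a version string against the same rule at runtime, here is a minimal sketch. The helper names `parse_release` and `sklearn_version_ok` are made up for illustration and are not part of scanpy or scikit-learn, and the parsing only looks at the leading release numbers, so it is a simplification rather than a full PEP 440 implementation.

```python
import re

# Versions reported as broken in this thread, plus the proposed lower bound.
BROKEN_SKLEARN = {(0, 21, 0), (0, 21, 1)}
MIN_SKLEARN = (0, 19, 1)


def parse_release(version: str) -> tuple:
    """Return the leading release numbers of a version string as a 3-tuple of ints.

    Hypothetical helper: suffixes such as 'rc1' or '.post1' are ignored.
    """
    numbers = re.match(r"\d+(?:\.\d+)*", version)
    if numbers is None:
        raise ValueError(f"cannot parse version: {version!r}")
    parts = [int(p) for p in numbers.group().split(".")]
    parts += [0] * (3 - len(parts))  # pad '0.22' to (0, 22, 0)
    return tuple(parts[:3])


def sklearn_version_ok(version: str) -> bool:
    """True if the version satisfies >= 0.19.1, != 0.21.0, != 0.21.1."""
    release = parse_release(version)
    return release >= MIN_SKLEARN and release not in BROKEN_SKLEARN


if __name__ == "__main__":
    for v in ("0.19.0", "0.20.0", "0.21.0", "0.21.1", "0.21.2"):
        print(v, sklearn_version_ok(v))
```

In practice the string passed in would be `sklearn.__version__`, and a failed check could raise a warning before the neighborhood graph is computed.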

flying-sheep added a commit that referenced this issue May 22, 2019
@gokceneraslan
Collaborator

Thanks!
