GPU tests failing (and taking a long time to run) #1357

Open
2 of 3 tasks
ivirshup opened this issue Feb 7, 2024 · 7 comments

Comments

@ivirshup
Member

ivirshup commented Feb 7, 2024

The GPU tests seem to be failing, and taking ~1.5 hours to do so, which is a little expensive.

It looks like this was triggered when #1354 was merged, but I doubt this is the cause. I suspect it has to do with CUDA versions

Rapids single cell's tests aren't failing, but it also has some slightly different libraries installed. The differences include libcublas and python.

TODO:

cc: @flying-sheep @Intron7


Failure messages look like:

FAILED anndata/tests/test_views.py::test_view_of_view[cupy_csc-spmatrix_bool_subset-spmatrix_bool_subset] - RuntimeError: Runtime compilation failed
@ivirshup
Member Author

ivirshup commented Feb 7, 2024

Diff between environments: https://www.diffchecker.com/VUIKrubT/

Interestingly, for cupy and cupy-core only the build number is different. numba, llvmlite, libgcc, and a few other libs are notably different. These could be related, since it's a compiler error.
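For anyone who wants to repeat this comparison locally rather than through a web diff tool, here is a minimal sketch that compares two `conda list`-style package listings. The function names `parse_conda_list` and `diff_environments` are hypothetical, and the package versions and build strings in the usage lines are placeholders, not the values from the CI logs.

```python
def parse_conda_list(text):
    """Parse `conda list`-style output into {name: (version, build)}.

    Assumes whitespace-separated columns: name, version, build, [channel].
    Comment lines (starting with '#') and short lines are skipped.
    """
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) < 3:
            continue
        name, version, build = parts[:3]
        packages[name] = (version, build)
    return packages


def diff_environments(old, new):
    """Return {name: (old_entry, new_entry)} for every package that differs.

    An entry is None when the package is missing from that environment.
    """
    changes = {}
    for name in sorted(set(old) | set(new)):
        if old.get(name) != new.get(name):
            changes[name] = (old.get(name), new.get(name))
    return changes


# Placeholder listings, NOT the real environments from the failing job.
passing = parse_conda_list("cupy 1.0.0 build_a\nnumba 0.1.0 build_x\n")
failing = parse_conda_list("cupy 1.0.0 build_b\nnumba 0.2.0 build_x\n")
for name, (before, after) in diff_environments(passing, failing).items():
    print(f"{name}: {before} -> {after}")
```

Keeping version and build as a pair makes a build-only change (as seen for cupy here) show up separately from a version bump.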

@ivirshup ivirshup added this to the 0.10.6 milestone Feb 7, 2024
@ivirshup
Member Author

ivirshup commented Feb 7, 2024

We've changed how cuda is installed and pinned cuda to 11.8, which was released in 2022.

Things still seem to fail with the new installation method and cuda 12.3 and 12.2.

I've merged the change to use 11.8 (plus a timeout and better version reporting) as a temporary fix for now.
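As an illustration of what "better version reporting" could look like, here is a rough sketch that prints the versions of the libraries suspected above before the tests run. This is not the change that was merged; the helper name `report_gpu_versions` and the choice of modules are assumptions. It uses `cupy.cuda.runtime.runtimeGetVersion()`, which returns the CUDA runtime version as an integer, and falls back to `None` when cupy or a GPU runtime is unavailable.

```python
import importlib


def report_gpu_versions(modules=("cupy", "numba", "llvmlite")):
    """Collect version strings for the given modules plus the CUDA runtime.

    Missing or unimportable modules are reported as None instead of raising,
    so the report can run on machines without a GPU stack.
    """
    versions = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except Exception:  # ImportError, or a failure while loading CUDA libs
            versions[name] = None
            continue
        versions[name] = getattr(module, "__version__", "unknown")

    try:
        import cupy

        # Integer-encoded CUDA runtime version reported by cupy.
        versions["cuda_runtime"] = cupy.cuda.runtime.runtimeGetVersion()
    except Exception:
        versions["cuda_runtime"] = None
    return versions


if __name__ == "__main__":
    for name, version in report_gpu_versions().items():
        print(f"{name}: {version}")
```

Printing this at the start of a CI job makes it easier to see which CUDA runtime a failing run actually picked up.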

@flying-sheep
Member

Please remove the top-level .ci directory that holds one file. That file can live in .github.

@ivirshup
Member Author

ivirshup commented Feb 8, 2024

That directory will hold more files once the minimum version test job is added.

@flying-sheep
Member

OK, but currently, that PR calls the directory ci, not .ci. Please unify that then.

@Intron7
Member

Intron7 commented Feb 8, 2024

@flying-sheep #1363 does that

@flying-sheep
Member

Awesome! Also thanks for helping fix this!

@ivirshup ivirshup modified the milestones: 0.10.6, 0.10.7 Mar 11, 2024
@flying-sheep flying-sheep modified the milestones: 0.10.7, 0.10.8 Apr 9, 2024