feat(wip): cupy histograms #1095

Draft
wants to merge 9 commits into
base: master

lgray
Collaborator

@lgray lgray commented May 15, 2024

work in progress on using cupy in the old coffea-hist package as a demonstrator

@martindurant

(cf https://docs.cupy.dev/en/stable/reference/generated/cupy.histogramdd.html , which is the standard regular-array API conforming to the numpy version - no boost or anything)
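
For readers unfamiliar with the call shape, here is a minimal sketch of the numpy-style `histogramdd` interface. It runs on numpy so it works without a GPU. Since the linked page describes the cupy function as conforming to the numpy version, swapping the import for `cupy` is assumed (not verified here) to give the same result on device. The sample points, weights, and the `xp` alias are made up for illustration.

```python
import numpy as xp  # assumption: `import cupy as xp` behaves the same on a GPU

# Three 2-D points (rows) with per-point weights; values are illustrative only.
sample = xp.array([[0.1, 0.2],
                   [0.1, 0.25],
                   [0.9, 0.8]])
weights = xp.array([1.0, 2.0, 0.5])

# Regular 2x2 binning over the unit square.
hist, edges = xp.histogramdd(
    sample,
    bins=(2, 2),
    range=((0.0, 1.0), (0.0, 1.0)),
    weights=weights,
)

print(hist)        # weighted counts per bin, shape (2, 2)
print(len(edges))  # one edge array per dimension
```

This covers only regular dense arrays; features such as categorical axes would still have to be layered on top, as discussed below.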

@lgray
Collaborator Author

lgray commented Jun 3, 2024

@martindurant - cool, it appears to use the same techniques (fill by atomics) so it'll be subject to the same scaling limitations I'm seeing. However, the number of calls to fill is a bit more lean so maybe it's worth backing an implementation with it. I'll have to try some benchmarks.

Otherwise - there's significant functionality missing from the cupy hists that we'll still have to add on top, if it turns out to run faster in the first place.

@martindurant

> Otherwise - there's significant functionality missing from the cupy hists that we'll still have to add on top, if it turns out to run faster in the first place.

Yes, I expect this is the case. dask-histogram also uses boost and adds a numpy-compatible API on top for those that expect it; and of course, it's designed to work with awkward. I expect there's a fair amount of sharing and refactoring that can eventually be done.

@lgray
Collaborator Author

lgray commented Jun 3, 2024

Yeah - I think it is all possible. Right now it's really about getting all the pipe fittings in place. I'll let you know if there's any clear win in the benchmarks.

@lgray
Collaborator Author

lgray commented Jun 3, 2024

@Saransh-cpp: @jpivarski and I talked last Friday and it came up that you might be interested in taking this "pilot project" and turning it into a full-blown UHI-compatible histogramming interface (a la scikit-hep/hist), but for cupy/cuda histograms.

What's in this PR has the necessary functionality for HEP and we can convert to scikit-hep/hist, but it would be nice to have a cohesive ecosystem and only convert to CPU memory-space at the last moment. This would grant us more or less infinite scaling.

We also have some ideas towards warpwise-distributed histograms where a (collection of) warps would tend to a sub-range of bins so that filling can be done more in parallel. This old implementation description demonstrates that if you stick to a warp (i.e. 32 bins) and replicate histograms to do the filling in parallel you can reach 10 GB/s filling rates, because there's no use of atomics.
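
As a rough illustration of the replicate-then-reduce idea, here is a CPU-only sketch in plain Python. It is not the linked implementation and not a GPU kernel: each "replica" stands in for a private histogram copy that only one worker writes to, so no atomic updates are needed, and the copies are summed at the end. The function names and the replica count are hypothetical.

```python
N_BINS = 32      # one warp's worth of bins, as in the comment above
N_REPLICAS = 4   # hypothetical number of private histogram copies


def bin_index(x, lo, hi, n_bins):
    """Return the regular-axis bin for x, or None if x is outside [lo, hi)."""
    if not (lo <= x < hi):
        return None
    i = int((x - lo) / (hi - lo) * n_bins)
    return min(i, n_bins - 1)  # guard against floating-point round-up


def fill_replicated(values, lo, hi, n_bins=N_BINS, n_replicas=N_REPLICAS):
    """Fill private copies from strided slices of the data, then reduce."""
    replicas = [[0] * n_bins for _ in range(n_replicas)]
    for r in range(n_replicas):
        # Replica r sees values r, r + n_replicas, ... and writes only to
        # its own copy, so no two writers ever touch the same counter.
        for x in values[r::n_replicas]:
            i = bin_index(x, lo, hi, n_bins)
            if i is not None:
                replicas[r][i] += 1
    # Reduction step: sum the private copies bin by bin.
    return [sum(column) for column in zip(*replicas)]


counts = fill_replicated([i * 0.5 for i in range(64)], 0.0, 32.0)
print(counts)
```

The trade-off is memory for contention: every replica costs a full copy of the bins, which is why the approach is attractive mainly for small histograms.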

This also has interesting parallels to cluster-distributed histograms where a (relatively enormous) histogram could be distributed across a whole dask cluster and achieve scaling to 100s of GBs in size or more. This would effectively remove scaling limitations for histograms for the foreseeable future and is probably important for achieving precision HL-LHC analyses.
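
The bin-ownership idea (a warp, or a cluster worker, tending to a sub-range of bins) can be sketched in plain Python as follows. This is only an illustration under assumed names: `partition_bins` and `fill_partitioned` are hypothetical helpers, no dask or cupy API is used, and a real system would route data between processes rather than loop in one.

```python
def partition_bins(n_bins, n_workers):
    """Split range(n_bins) into n_workers contiguous (start, stop) ranges."""
    base, extra = divmod(n_bins, n_workers)
    ranges, start = [], 0
    for w in range(n_workers):
        stop = start + base + (1 if w < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def fill_partitioned(bin_indices, n_bins, n_workers):
    """Route each precomputed bin index to the shard that owns it."""
    ranges = partition_bins(n_bins, n_workers)
    shards = [[0] * (stop - start) for start, stop in ranges]
    for i in bin_indices:
        for w, (start, stop) in enumerate(ranges):
            if start <= i < stop:
                shards[w][i - start] += 1  # only the owner updates this bin
                break
    return ranges, shards


ranges, shards = fill_partitioned([0, 4, 4, 9, 3], n_bins=10, n_workers=3)
print(ranges)
print(shards)
```

Because each shard holds only its own slice of bins, the total histogram size is bounded by the combined memory of the workers rather than by any single one.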

Anyway - please let us know if you are interested in turning this into a more mature package and possibly adding features to it! We're happy to answer any questions you may have.
