Default build system support for CUDA/PTX #19302

Closed
maleadt opened this issue Nov 11, 2016 · 17 comments

Labels: domain:building (Build system, or building Julia or its dependencies), domain:gpu (Affects running Julia on a GPU)
Milestone: 0.6.0

maleadt (Member) commented Nov 11, 2016

As master is getting closer to being compatible with CUDAnative.jl, I'd like to discuss how we make the default build and/or the binary releases compatible with it. I think it would be a great addition for users to be able to target GPUs without too much effort.

  1. I've been using LLVM 3.9 on my branch, including some patches to get rid of some PTX-specific bugs. I guess the upgrade to LLVM 3.9 (#19123) is bound to happen before that, at which point including some extra patches shouldn't hurt?

  2. The next obvious change is to enable the PTX back-end in the default build of LLVM. I'm not sure how hard we try to keep libLLVM small (vs., e.g., the effort to keep the sysimg small), but FWIW enabling the PTX target next to X86 increases the LLVM library file size from 36624 to 37664 kB (Linux x64), a mere 3% increase. (A sketch of what this could look like follows this list.)

  3. Lastly, CUDAnative.jl uses LLVM.jl, which wraps and extends the C API. This requires llvm-config as well as the LLVM headers, neither of which is part of the binary build. @vtjnash suggested creating both a regular build tarball and an extras or tools archive containing non-critical headers and tools (this could also include, e.g., libclang for Cxx.jl).
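
To make points 2 and 3 a bit more concrete, here's a rough sketch of what's involved; the variable and file names, and whether Make.user's LLVM_TARGETS actually passes through to LLVM's LLVM_TARGETS_TO_BUILD CMake option, are assumptions rather than a tested recipe:

    # Point 2 -- Make.user sketch: build LLVM with the NVPTX back-end next to
    # the host target (assumed to end up in LLVM's LLVM_TARGETS_TO_BUILD)
    LLVM_TARGETS := host;NVPTX

    # Point 3 -- the kind of step LLVM.jl needs llvm-config and the headers for:
    # compiling a small C++ shim against the shipped LLVM (names hypothetical)
    #   g++ -fPIC -shared extras.cpp -o libLLVM_extras.so \
    #       $(llvm-config --cxxflags --ldflags --libs)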

cc @vchuravy @tkelman @staticfloat

Added a tentative 0.6.0 milestone; it would be nice to have experimental GPU support in that version.
But I haven't been on the triage calls, so please change if that has already been decided against.

@maleadt maleadt added the domain:building Build system, or building Julia or its dependencies label Nov 11, 2016
@maleadt maleadt added this to the 0.6.0 milestone Nov 11, 2016
tkelman (Contributor) commented Nov 11, 2016

I don't think we should be adding "nice to have" items to the milestone if we plan on feature freezing next month. If this can be made to work and doesn't break anything or make the binaries noticeably larger, we can do it, but I wouldn't consider it release blocking.

In terms of llvm-config and headers for packages, I think the packages are going to need to figure out how to build and distribute binaries for the platforms they want to support. I don't want the base buildbots being responsible for providing package binaries for anything beyond "stdlib." There's not enough bandwidth or S3 budget to provide binaries for all packages, so we have to say no to some, and then you get into playing favorites - what conditions qualify a package as "important enough" to provide binaries for? I think the only sensible answer is: if it's in the future stdlib it qualifies; if it isn't, it doesn't.

maleadt (Member, Author) commented Nov 11, 2016

The "nice to have" extends it just being nice: I'd like to get real user feedback on how people want to use GPU codegen support in order to improve and stabilize the interface (params, hooks) with inference and codegen before 1.0.

Moving the burden of providing the necessary binaries to the packages (LLVM.jl, in my case) is definitely an option, but given how much development has gone into tweaking and tuning the buildbots, I'm not sure how easy it would be to faithfully reproduce that on, e.g., Travis?

ViralBShah (Member) commented Nov 11, 2016

I think in the early user-feedback stage we can go with a Cxx-like model. Let's not worry about the buildbots, but get it to a stage where people can easily build from source and make it work.

Patches can certainly be included in master for ease of building - even though GPU support may only work in source builds.

As for the S3 budget, I don't think adding some GPU support is even going to register; it's less of a concern. As GPU support becomes more and more reliable and usable, we will certainly want to provide more support - but that seems like more of a 1.0-timeframe discussion. This is not a package thing but a core compiler capability, the way I see it.

tkelman (Contributor) commented Nov 11, 2016

The S3 budget remark was referring to building multiple classes of binary distributions, with separate binaries for things like headers or libraries/tools only used at build time by base.

We will be working on making a buildbot-like generic binary build system easier to reproduce for packages anyway. The current buildbot setup is fragile and not automated enough to really scale beyond what we're currently using it for.

StefanKarpinski (Sponsor Member) commented

@maleadt: this can go on the milestone if you're willing to own it and make it happen by end of year (feature freeze for 0.6). Regardless, I think we should switch to LLVM 3.9 ASAP.

ViralBShah (Member) commented

If we are to move to LLVM 3.9, shouldn't we be doing it about now, to give adequate testing time?

StefanKarpinski (Sponsor Member) commented

Yes – ASAP. I've been preaching this for a while. There was some pushback but I forget why.

maleadt (Member, Author) commented Nov 14, 2016

I'll focus on getting source builds GPU-compatible for 0.6, which should be pretty easy and non-controversial (enable the PTX back-end, add some patches).

I also spent some time figuring out whether it's possible to distribute binaries for LLVM.jl, but that most likely won't work; see LLVM.jl/#10. But I agree that figuring this out is 1.0 territory.

tkelman (Contributor) commented Dec 29, 2016

closed by #19323 + #19678

@tkelman tkelman closed this as completed Dec 29, 2016
maleadt (Member, Author) commented Dec 30, 2016

Point 3 hasn't really been resolved yet, but we can do without for now. It'll require users wanting GPU support to do a source build, albeit an unmodified one, and we can work on providing the necessary auxiliary files at a later time.

Also, for those wanting to try it out: there are some outstanding issues in CUDAnative due to #17057, but I'll be working on those next week.

tkelman (Contributor) commented Dec 31, 2016

How much extra code do you need to compile that requires llvm-config and the headers? If it's a small set of functions you want to expose, could we put entry points to them here in libjulia?

maleadt (Member, Author) commented Jan 4, 2017

(missed your comment)

It's not that much code, but it needs to change in lockstep with the rest of the package, so I'd rather keep it there. Although it would solve the problem, of course...

tkelman (Contributor) commented Jan 4, 2017

Can you ship a binary version in the package then?

maleadt (Member, Author) commented Jan 4, 2017

I tried, but couldn't get it to work:

This is probably not going to work. The extras library uses some LLVM C++ APIs, e.g. here, which might result in object-layout-dependent code getting baked into the extras library. E.g., continuing on the example above, Function::getAttributes and AttributeSet::isEmpty are both inlined, resulting in a call to AttributeSetImpl::getNumSlots, which is defined in the header and hence compiled into our library.

And indeed, memory layout is mostly implementation-defined, and it does differ between, e.g., Travis's compiler (on their Trusty image) and my local clang 3.8, resulting in faulty behavior.
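
As a contrived illustration of the hazard (a toy header, not the actual LLVM code), an inline accessor defined in a header bakes the implementation object's layout into whatever compiles against it:

    // toy header, not LLVM's: the inline accessor hard-codes Impl's layout
    struct Impl {
        unsigned flags;   // if the prebuilt library was compiled without this field
        unsigned slots;   // (or with the fields reordered), the offset below is wrong
    };

    struct Handle {
        Impl *impl;
        // inlined into every client that calls it: "read the second unsigned of Impl"
        unsigned numSlots() const { return impl->slots; }
    };

If the library handing out those Impl pointers was built with a different layout, the client silently reads garbage, without any link-time error to catch it.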

If anybody reading this has any suggestions, please chime in.

tkelman (Contributor) commented Jan 4, 2017

Were you building with the same compiler as the Julia binaries, or the system compiler? The latter can easily differ in ABI. Either way, I guess revisit when things stabilize a bit in the package? You can always be more specific about the package's supported version range of Julia if that helps make this kind of thing more predictable going forward.

maleadt (Member, Author) commented Jan 4, 2017

Yeah, I wouldn't want to rush that code into base right now; having users test it and revisiting this in a couple of months seems fine.

But continuing on ABI differences, wouldn't that imply that it's never safe to build and link against a prebuilt C++ library (i.e. libLLVM) unless the toolchain is identical (which it never will be)?

tkelman (Contributor) commented Jan 4, 2017

The toolchain will be identical if we standardize the way packages build binaries. On Linux especially, producing generic binaries does require using a uniform toolchain.
