Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LLVM Bitcode Linker: A self contained linker for nvptx and other targets #117458

Merged
merged 6 commits into from Mar 11, 2024

Conversation

kjetilkjeka
Copy link
Contributor

@kjetilkjeka kjetilkjeka commented Oct 31, 2023

This PR introduces a new linker named llvm-bitcode-linker. It is a self-contained linker that can be used to link programs in llbc before optimizing and compiling to native code. It will first be used internally in the Rust compiler to enable tests for the nvptx64-nvidia-cuda target as the original rust-ptx-linker is deprecated. It will then be provided to users of the nvptx64-nvidia-cuda target with the purpose of linking ptx. More targets than nvptx will also be supported eventually.

The PR introduces a new unstable LinkerFlavor for the compiler. The compiler will also not be shipped with rustc but most likely instead be shipped in it's own unstable component (a follow up PR will be opened for this). This means that merging this PR should not add any stability guarantees.

When more details of self-contained is implemented it will only be possible to use the linker when -Clink-self-contained=+linker is passed.

Original Description

When this PR was created it was focused a bit differently. The original text is preserved here in case there's some interests in it

I have experimenting with approaches to replace the ptx-linker and enable the nvptx target tests again. I think it's time to get some feedback on the approach.

The problem

The only useful linker for the nvptx target is this crate. Since this linker performs linking on llvm bitcode it needs to track the llvm version of rustc and use the same format. It has not been maintained for 3+ years and must be considered abandoned. Over the years rust have upgraded LLVM while the linker has been left to bitrot. It is no longer in a usable state.

Due to the difficulty of keeping the ptx-linker up to date outside of tree the nvptx tests was disabled a long time ago. It was previously discussed if adding the ptx-linker to the rust repo would be a possibility. My efforts in doing this stopped at getting an answered if the license would prohibit it from inclusion in the Rust repo. I therefore concluded that a re-write would be necessary.

The possible solution presented here

The llvm tools know perfectly well how to link and optimize llvm bitcode. Each of them only perform a single task, and are therefore a bit cumbersome to call with the current linker approach rustc takes.

This PR adds a simple tool (current name embedded-linker) which can link self contained (often embedded) programs in llvm bitcode before compiling to the target format. Optimization will also be performed if lto is enabled. The rust compiler will make a single invocation to this tool, while the tool will orchestrate the many calls to the llvm tools.

The questions

  • Is having control over the nvptx linking and therefore also tests worth it to add such tool? or should the tool live outside the rust repo?
  • Is the approach of calling llvm tools acceptable? Or would we want to keep the ptx-linker approach of using the llvm library? The tools seems to provide more simplicity and stability, but more intermediate files are being written. Perhaps there also are some performance penalty for the calling tools approach.
  • What is the process for adding such tool? MCP?
  • Does adding llvm-link to the llvm-tool component require any process?
  • Does it require some sort of FCP to remove ptx-linker as the default linker for ptx? Or is it sufficient that using the upstream ptx-linker is broken in its current state. it is possible to use a somewhat patched version of ptx-linker.

@rustbot
Copy link
Collaborator

rustbot commented Oct 31, 2023

r? @oli-obk

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added A-testsuite Area: The testsuite used to check the correctness of rustc S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue. labels Oct 31, 2023
@kjetilkjeka
Copy link
Contributor Author

@rustbot label +O-NVPTX

@rustbot rustbot added the O-NVPTX Target: the NVPTX LLVM backend for running rust on GPUs, https://llvm.org/docs/NVPTXUsage.html label Oct 31, 2023
@petrochenkov petrochenkov self-assigned this Oct 31, 2023
@kjetilkjeka kjetilkjeka changed the title Embedded linker Embedded linker: Alternative approach to the current ptx-linker Oct 31, 2023
@rust-log-analyzer

This comment has been minimized.

@oli-obk

This comment was marked as resolved.

@rustbot

This comment was marked as resolved.

@oli-obk

This comment was marked as resolved.

@rustbot

This comment was marked as resolved.

@oli-obk
Copy link
Contributor

oli-obk commented Oct 31, 2023

cc @nagisa and @Zoxc tor previous involvement

@kjetilkjeka
Copy link
Contributor Author

Cc: @workingjubilee as I remember we talked about this in this issue denzp/rust-ptx-linker#34 (comment)

@bjorn3
Copy link
Member

bjorn3 commented Oct 31, 2023

Can we require fat LTO for nvptx? When rustc does fat LTO it already combines all crates into a single object file.

@juntyr
Copy link
Contributor

juntyr commented Nov 1, 2023

I'm really excited to hear that PTX might gain a maintained linker again!

Two years ago I was using Rust to write single-source kernels for NVIDIA GPUs, where LTO and inlining were crucial. I at some point had to fork the ptx-linker crate to keep LTO working with newer LLVM versions. It would be awesome if linking would "just work" instead again, so that the linker's is kept in sync and supports fat LTO.

Thank you so much for working on this ❤️

@kjetilkjeka
Copy link
Contributor Author

kjetilkjeka commented Nov 1, 2023

Can we require fat LTO for nvptx? When rustc does fat LTO it already combines all crates into a single object file.

It would certainly be possible, rust is capable of calling the right functions in llvm to make it work. But it doesn't work "out of the box" and I'm not certain it's the correct thing to do long term eiteher.

If we decided to use only lto="fat" we would disallow lto="thin" which helps a lot with compile times. One could argue that the general ptx program is small enough for this to be acceptable, but it doesn't feel right to leave such an optimization out without an excellent reason. Perhaps a bigger problem is that linker-plugin-lto would not work which would make cross-language lto more difficult.

When it comes to compiler change we would still need to add llc as a linker known to the compiler. This is required to compile from llvm-bitcode to native. The linker would be an unconventional one as it would not actually be able to link anything and only accept a single bitcode file as input.

When I tested it just now it seems like libcompiler_builtins still produces an .rlib which is linked by the linker even when lto = "fat". I'm not sure if this is a bug or expected behavior, but it must be resolved before we may assume that the linker is only given a single bitcode file. (Note: -Zbuild-std=core is required due to not distributing this library and LTO towards libcore is required to work around some bugs right now and desired at the long term since calling functions on ptx is tremendously expensive.)

As an addition, there might be future benefits an approach as this embedded-linker can have to other targets than ptx as well. Maybe it can be used to bring thin lto to embedded targets where the native linker doesn't support it. It can also provide compilation without requiring to install an external target specific linker.

Is this an approach we would like to explore further even with the known deficiencies listed above? Is the main advantage that we avoid yet another tool that needs to be maintained? Or that we get more control using libLLVM compared to calling the llvm tools?

@bjorn3
Copy link
Member

bjorn3 commented Nov 1, 2023

When I tested it just now it seems like libcompiler_builtins still produces an .rlib which is linked by the linker even when lto = "fat".

That will be fixed soon. There is an open PR for making compiler-builtins participate in LTO.

@rust-log-analyzer

This comment has been minimized.

@bors

This comment was marked as resolved.

@kjetilkjeka
Copy link
Contributor Author

There are a few questions in my original post, but I think the one that stops me from moving forward is figuring out whether this is an approach that can be accepted. Are there any processes required to get something like this MR approved (MCP?) or is it just a matter of completing the implementation?

Insight like "this is probably not going to be accepted, but bjorn3's alternative probably will be" would also be very helpful to help me avoid putting effort into something that is probably not going to go anywhere.

@petrochenkov
Copy link
Contributor

petrochenkov commented Nov 7, 2023

I planned to review it in a couple of weeks.
Generally, if the new tool is interface-compatible with the previous ptx-linker then it doesn't need a new linker flavor, it can reuse the old one.
Also, whether tools are looked up inside the rustc toolchain (e.g. the new linker) or elsewhere (e.g. the old ptx-linker) is controlled by the -C link-self-contained option, you may check how it works as well.

@petrochenkov
Copy link
Contributor

I think the general approach is fine, we can ship a small additional tool, especially if it's not enabled by default.
The linker just needs to reuse the old PTX linker flavor and the linker's location should be determined through -C link-self-contained, as I said in the comment above.
@rustbot author

@rustbot rustbot removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 23, 2023
@rustbot
Copy link
Collaborator

rustbot commented Mar 11, 2024

There are merge commits (commits with multiple parents) in your changes. We have a no merge policy so these commits will need to be removed for this pull request to be merged.

You can start a rebase with the following commands:

$ # rebase
$ git rebase -i master
$ # delete any merge commits in the editor that appears
$ git push --force-with-lease

The following commits are merge commits:

@rustbot rustbot removed has-merge-commits PR has merge commits, merge with caution. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 11, 2024
@kjetilkjeka
Copy link
Contributor Author

@petrochenkov Yet another conflict in change_trackers.rs resolved.

@rustbot label -S-waiting-on-author +S-waiting-on-review

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 11, 2024
@petrochenkov
Copy link
Contributor

@bors r-

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Mar 11, 2024
@petrochenkov
Copy link
Contributor

@bors r+

@bors
Copy link
Contributor

bors commented Mar 11, 2024

📌 Commit 843dd28 has been approved by petrochenkov

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 11, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 11, 2024
…kingjubilee

Rollup of 15 pull requests

Successful merges:

 - rust-lang#116791 (Allow codegen backends to opt-out of parallel codegen)
 - rust-lang#116793 (Allow targets to override default codegen backend)
 - rust-lang#117458 (LLVM Bitcode Linker: A self contained linker for nvptx and other targets)
 - rust-lang#119385 (Fix type resolution of associated const equality bounds (take 2))
 - rust-lang#121438 (std support for wasm32 panic=unwind)
 - rust-lang#121893 (Add tests (and a bit of cleanup) for interior mut handling in promotion and const-checking)
 - rust-lang#122080 (Clarity improvements to `DropTree`)
 - rust-lang#122152 (Improve diagnostics for parenthesized type arguments)
 - rust-lang#122166 (Remove the unused `field_remapping` field from `TypeLowering`)
 - rust-lang#122249 (interpret: do not call machine read hooks during validation)
 - rust-lang#122299 (Store backtrace for `must_produce_diag`)
 - rust-lang#122318 (Revision-related tweaks for next-solver tests)
 - rust-lang#122320 (Use ptradd for vtable indexing)
 - rust-lang#122328 (unix_sigpipe: Replace `inherit` with `sig_dfl` in syntax tests)
 - rust-lang#122330 (bootstrap readme: fix, improve, update)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit e1ceadc into rust-lang:master Mar 11, 2024
11 checks passed
@rustbot rustbot added this to the 1.78.0 milestone Mar 11, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Mar 11, 2024
Rollup merge of rust-lang#117458 - kjetilkjeka:embedded-linker, r=petrochenkov

LLVM Bitcode Linker: A self contained linker for nvptx and other targets

This PR introduces a new linker named `llvm-bitcode-linker`. It is a `self-contained` linker that can be used to link programs in `llbc` before optimizing and compiling to native code. It will first be used internally in the Rust compiler to enable tests for the `nvptx64-nvidia-cuda` target as the original `rust-ptx-linker` is deprecated. It will then be provided to users of the `nvptx64-nvidia-cuda` target with the purpose of linking ptx. More targets than nvptx will also be supported eventually.

The PR introduces a new unstable `LinkerFlavor` for the compiler. The compiler will also not be shipped with rustc but most likely instead be shipped in it's own unstable component (a follow up PR will be opened for this). This means that merging this PR should not add any stability guarantees.

When more details of `self-contained` is implemented it will only be possible to use the linker when `-Clink-self-contained=+linker` is passed.

<details>
  <summary>Original Description</summary>

**When this PR was created it was focused a bit differently. The original text is preserved here in case there's some interests in it**

I have experimenting with approaches to replace the ptx-linker and enable the nvptx target tests again. I think it's time to get some feedback on the approach.

### The problem
The only useful linker for the nvptx target is [this crate](https://github.com/denzp/rust-ptx-linker). Since this linker performs linking on llvm bitcode it needs to track the llvm version of rustc and use the same format. It has not been maintained for 3+ years and must be considered abandoned. Over the years rust have upgraded LLVM while the linker has been left to bitrot. It is no longer in a usable state.

Due to the difficulty of keeping the ptx-linker up to date outside of tree the nvptx tests was [disabled a long time ago](rust-lang@f8f9a28). It was [previously discussed](rust-lang#96842 (comment)) if adding the ptx-linker to the rust repo would be a possibility. My efforts in doing this stopped at getting an answered if the license would prohibit it from inclusion in the [Rust repo](rust-lang#96842 (comment)). I therefore concluded that a re-write would be necessary.

### The possible solution presented here
The llvm tools know perfectly well how to link and optimize llvm bitcode. Each of them only perform a single task, and are therefore a bit cumbersome to call with the current linker approach rustc takes.

This PR adds a simple tool (current name `embedded-linker`) which can link self contained (often embedded) programs in llvm bitcode before compiling to the target format. Optimization will also be performed if lto is enabled. The rust compiler will make a single invocation to this tool, while the tool will orchestrate the many calls to the llvm tools.

### The questions
 - Is having control over the nvptx linking and therefore also tests worth it to add such tool? or should the tool live outside the rust repo?
 - Is the approach of calling llvm tools acceptable? Or would we want to keep the ptx-linker approach of using the llvm library? The tools seems to provide more simplicity and stability, but more intermediate files are being written. Perhaps there also are some performance penalty for the calling tools approach.
 - What is the process for adding such tool? MCP?
 - Does adding `llvm-link` to the llvm-tool component require any process?
 - Does it require some sort of FCP to remove ptx-linker as the default linker for ptx? Or is it sufficient that using the upstream ptx-linker is broken in its current state. it is possible to use a somewhat patched version of ptx-linker.
</details>
jieyouxu added a commit to jieyouxu/rust that referenced this pull request Apr 15, 2024
…mponent, r=Mark-Simulacrum

Distribute LLVM bitcode linker as a preview component

The self-contained LLVM bitcode linker was recently added in rust-lang#117458. It is currently only in use to link the Nvidia ptx assembly tests when running rustc tests (local or CI). In fact, the linker itself is currently only usable for the `nvptx64-nvidia-cuda` target, but more targets will be supported in the future.

The reason a new linker was needed for the ptx format is that the [old one](https://github.com/denzp/rust-ptx-linker) has not been updated the last few years. It worked fine for a while, but as LLVM changed it broke and the nvptx tests was [disabled in rustc back in 2019](rust-lang@f8f9a28). It was ad-hoc patched and have been used in a sub-optimal state by the community until now.

If this PR is merged, the LLVM bitcode linker will be distributed as a preview component that can be used as a replacement for the old ptx-linker for development in addition to rustc tests. In addition to installing the `llvm-bitcode-linker` component, also the `llvm-tools` component must be installed as the `llvm-bitcode-linker` works by calling llvm tools.

Even though the LLVM bitcode linker is in its early stages it already now provides a lot of value over the old ptx-linker just by working and using up-to-date llvm tooling. By shipping it as a component it will be easier to gather user experience and improving it.

`@petrochenkov` when installing as a component it will be installed in the self-contained folder and will not work with `-Clink-self-contained=no` (although for some reason I expect to be a bug `-Clink-self-contained=-linker` doesn't properly disable it). However, when building using `x.py build` it will be placed in the `lib/rustlib/<target>/bin` directory and will be available for internal tests even if `-Clink-self-contained=no` is passed.

CC: `@Mark-Simulacrum` as I very briefly discussed it with you some months ago https://rust-lang.zulipchat.com/#narrow/stream/242791-t-infra/topic/.E2.9C.94.20How.20to.20ship.20a.20new.20tool.20.28embedded.20linker.29.20to.20users.3F
jieyouxu added a commit to jieyouxu/rust that referenced this pull request Apr 15, 2024
…mponent, r=Mark-Simulacrum

Distribute LLVM bitcode linker as a preview component

The self-contained LLVM bitcode linker was recently added in rust-lang#117458. It is currently only in use to link the Nvidia ptx assembly tests when running rustc tests (local or CI). In fact, the linker itself is currently only usable for the `nvptx64-nvidia-cuda` target, but more targets will be supported in the future.

The reason a new linker was needed for the ptx format is that the [old one](https://github.com/denzp/rust-ptx-linker) has not been updated the last few years. It worked fine for a while, but as LLVM changed it broke and the nvptx tests was [disabled in rustc back in 2019](rust-lang@f8f9a28). It was ad-hoc patched and have been used in a sub-optimal state by the community until now.

If this PR is merged, the LLVM bitcode linker will be distributed as a preview component that can be used as a replacement for the old ptx-linker for development in addition to rustc tests. In addition to installing the `llvm-bitcode-linker` component, also the `llvm-tools` component must be installed as the `llvm-bitcode-linker` works by calling llvm tools.

Even though the LLVM bitcode linker is in its early stages it already now provides a lot of value over the old ptx-linker just by working and using up-to-date llvm tooling. By shipping it as a component it will be easier to gather user experience and improving it.

``@petrochenkov`` when installing as a component it will be installed in the self-contained folder and will not work with `-Clink-self-contained=no` (although for some reason I expect to be a bug `-Clink-self-contained=-linker` doesn't properly disable it). However, when building using `x.py build` it will be placed in the `lib/rustlib/<target>/bin` directory and will be available for internal tests even if `-Clink-self-contained=no` is passed.

CC: ``@Mark-Simulacrum`` as I very briefly discussed it with you some months ago https://rust-lang.zulipchat.com/#narrow/stream/242791-t-infra/topic/.E2.9C.94.20How.20to.20ship.20a.20new.20tool.20.28embedded.20linker.29.20to.20users.3F
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Apr 15, 2024
Rollup merge of rust-lang#123423 - kjetilkjeka:llvm_bitcode_linker_component, r=Mark-Simulacrum

Distribute LLVM bitcode linker as a preview component

The self-contained LLVM bitcode linker was recently added in rust-lang#117458. It is currently only in use to link the Nvidia ptx assembly tests when running rustc tests (local or CI). In fact, the linker itself is currently only usable for the `nvptx64-nvidia-cuda` target, but more targets will be supported in the future.

The reason a new linker was needed for the ptx format is that the [old one](https://github.com/denzp/rust-ptx-linker) has not been updated the last few years. It worked fine for a while, but as LLVM changed it broke and the nvptx tests was [disabled in rustc back in 2019](rust-lang@f8f9a28). It was ad-hoc patched and have been used in a sub-optimal state by the community until now.

If this PR is merged, the LLVM bitcode linker will be distributed as a preview component that can be used as a replacement for the old ptx-linker for development in addition to rustc tests. In addition to installing the `llvm-bitcode-linker` component, also the `llvm-tools` component must be installed as the `llvm-bitcode-linker` works by calling llvm tools.

Even though the LLVM bitcode linker is in its early stages it already now provides a lot of value over the old ptx-linker just by working and using up-to-date llvm tooling. By shipping it as a component it will be easier to gather user experience and improving it.

``@petrochenkov`` when installing as a component it will be installed in the self-contained folder and will not work with `-Clink-self-contained=no` (although for some reason I expect to be a bug `-Clink-self-contained=-linker` doesn't properly disable it). However, when building using `x.py build` it will be placed in the `lib/rustlib/<target>/bin` directory and will be available for internal tests even if `-Clink-self-contained=no` is passed.

CC: ``@Mark-Simulacrum`` as I very briefly discussed it with you some months ago https://rust-lang.zulipchat.com/#narrow/stream/242791-t-infra/topic/.E2.9C.94.20How.20to.20ship.20a.20new.20tool.20.28embedded.20linker.29.20to.20users.3F
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) A-testsuite Area: The testsuite used to check the correctness of rustc O-NVPTX Target: the NVPTX LLVM backend for running rust on GPUs, https://llvm.org/docs/NVPTXUsage.html S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants