Tracking Issue for inline assembly (asm!) #72016

Closed
3 of 4 tasks
Amanieu opened this issue May 8, 2020 · 221 comments
Labels
A-inline-assembly Area: inline asm!(..) B-unstable Feature: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. F-asm `#![feature(asm)]` (not `llvm_asm`) finished-final-comment-period The final comment period is finished for this PR / Issue. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-lang Relevant to the language team, which will review and decide on the PR/issue. WG-embedded Working group: Embedded systems

Comments

@Amanieu
Member

Amanieu commented May 8, 2020

This is a tracking issue for the RFC 2873 (rust-lang/rfcs#2873).
The feature gate for the issue is #![feature(asm)].

Stabilization

Steps

Implementation history


January status update


April status update


July status update


November stabilization report for FCP
November FCP checklist

@Amanieu Amanieu added A-inline-assembly Area: inline asm!(..) C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. labels May 8, 2020
@Amanieu Amanieu changed the title Tracking Issue for inline assembly Tracking Issue for inline assembly (asm!) May 8, 2020
@jonas-schievink jonas-schievink added the T-lang Relevant to the language team, which will review and decide on the PR/issue. label May 8, 2020
@jonas-schievink jonas-schievink added the F-asm `#![feature(asm)]` (not `llvm_asm`) label May 8, 2020
@joshtriplett
Member

Note: if you'd like to report an issue in inline assembly, please report it as a separate github issue, and just link to this one. Please don't report issues in inline assembly as comments on this tracking issue.

@BartMassey
Contributor

Should the asm! macro be available directly from the prelude as it is now, or should it have to be imported from std::arch::$ARCH::asm? The advantage of the latter is that it would make it explicit that the asm! macro is target-specific, but it would make cross-platform code slightly longer to write.

Definitely want the explicit import. This would also make it a bit clearer what's going on when somebody tries to compile an old-style asm! with a newer compiler, which is going to happen a lot: you'd get an unresolved symbol rather than mysterious syntax errors.
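For later readers: the stabilization PR quoted further down this thread removed `asm!` from the prelude, so stable code imports it explicitly from `std::arch` (or `core::arch`). A minimal sketch of what that looks like, assuming an x86-64 target with a portable fallback elsewhere; the function name `add_one` is made up for illustration.

```rust
// Explicit import: on stable Rust `asm!` is not in the prelude.
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Adds one to `x` with a single x86-64 `add` instruction (wraps on overflow).
#[cfg(target_arch = "x86_64")]
fn add_one(x: u64) -> u64 {
    let mut v = x;
    // SAFETY: the block only modifies the register holding `v`;
    // it touches neither memory nor the stack.
    unsafe {
        asm!("add {0}, 1", inout(reg) v, options(nomem, nostack));
    }
    v
}

/// Portable fallback so the sketch still builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
fn add_one(x: u64) -> u64 {
    x.wrapping_add(1)
}
```

Gating both the import and the function on `target_arch` keeps the target-specific nature of the macro visible at the use site, which is the point argued for above.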

@jonas-schievink jonas-schievink added the WG-embedded Working group: Embedded systems label May 25, 2020
@joshtriplett
Member

joshtriplett commented May 26, 2020 via email

@petrochenkov
Contributor

Minor compatibility concern: asm! supports some LLVM-isms like C style comments /* comment */ (#73056 (comment)).

Other backends using external assemblers may need to do some pre-processing before passing the asm text to them.

@programmerjake
Member

Minor compatibility concern: asm! supports some LLVM-isms like C style comments /* comment */ (#73056 (comment)).

GNU Assembler (IIRC defined as the official assembly dialect by the RFC), and by extension GCC, supports C-style comments too, so they are definitely not LLVM-specific.

@newpavlov
Contributor

newpavlov commented Jun 9, 2020

Namespacing the asm! macro

I am also in favor of explicit imports from arch modules. I would suggest going even further and introduce separate macros for each supported target (e.g. asm_x86!, asm_arm!, etc.).

@joshtriplett
Member

Some feedback from various places in response to the blog post. (I'm skipping general expressions of awesomeness, though there have been many; I'm quoting those that have specific feedback.)

@amluto (Andy Lutomirski, prominent Linux kernel developer who works on a lot of low-level x86) at https://news.ycombinator.com/item?id=23467108 :

Rust folks, thank you so much for making this far, far better than GCC’s asm syntax for C. Also, thank you for using Intel x86 syntax instead of AT&T.
It would be delightful if GCC were to adopt something similar after this stabilizes in Rust.

Several people on HN were a little confused about the backward-compatibility story here; in the future, when we talk about features that have only existed on nightly, we need to be clearer in our messaging around our stability policy.

Several people wondered why this used a string constant; it'd be good to explain that in documentation. And this doesn't mean the assembly isn't parsed, it means it isn't parsed by rustc (it's parsed by LLVM in the backend).

We really should have mentioned, in the blog post, that AT&T syntax is supported with options(att_syntax), so that people wouldn't think they had to translate all their assembly. I've prepared an update to the blog post mentioning that.
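To illustrate the `options(att_syntax)` point, here is a small sketch that writes the same addition once in the default Intel syntax and once in AT&T syntax. It assumes an x86-64 target; the function names `add_intel` and `add_att` are hypothetical.

```rust
/// Default (Intel) operand order: destination first.
#[cfg(target_arch = "x86_64")]
fn add_intel(a: u64, b: u64) -> u64 {
    use std::arch::asm;
    let mut sum = a;
    unsafe {
        asm!("add {dst}, {src}", dst = inout(reg) sum, src = in(reg) b,
             options(nomem, nostack));
    }
    sum
}

/// Same instruction with `att_syntax`: source first, destination last.
/// Register operands are printed with the `%` prefix automatically.
#[cfg(target_arch = "x86_64")]
fn add_att(a: u64, b: u64) -> u64 {
    use std::arch::asm;
    let mut sum = a;
    unsafe {
        asm!("add {src}, {dst}", dst = inout(reg) sum, src = in(reg) b,
             options(att_syntax, nomem, nostack));
    }
    sum
}

// Portable fallbacks so the sketch builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
fn add_intel(a: u64, b: u64) -> u64 { a.wrapping_add(b) }
#[cfg(not(target_arch = "x86_64"))]
fn add_att(a: u64, b: u64) -> u64 { a.wrapping_add(b) }
```

Only the template string and the option differ; the operand declarations are identical in both versions.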

Someone mentioned preferring an "explicit clobber" syntax (like clobber("ax")) rather than having to write out("ax") _. I can understand that.

Closely related, I think we should definitely have both "clobber all function-clobbered registers" and "clobber all general-purpose registers" options.
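For reference, this is roughly what the `out("reg") _` clobber spelling looks like in practice. A sketch assuming x86-64, where `mul` always writes the high half of the product to `rdx` even when the caller only wants the low half; `mul_low` is a made-up name.

```rust
/// Returns the low 64 bits of `a * b` using the one-operand `mul` instruction.
#[cfg(target_arch = "x86_64")]
fn mul_low(a: u64, b: u64) -> u64 {
    use std::arch::asm;
    let lo: u64;
    unsafe {
        asm!(
            // Computes rdx:rax = rax * {factor}.
            "mul {factor}",
            factor = in(reg) a,
            inlateout("rax") b => lo,
            // The high half is not wanted, so rdx is declared as a clobber
            // by writing to `_`.
            lateout("rdx") _,
            options(nomem, nostack),
        );
    }
    lo
}

/// Portable fallback so the sketch builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
fn mul_low(a: u64, b: u64) -> u64 {
    a.wrapping_mul(b)
}
```

The "clobber all function-clobbered registers" idea corresponds to the `clobber_abi("C")` operand that stable `asm!` accepts; it is not exercised in this sketch.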

@amluto

amluto commented Jun 9, 2020

I’ll add one more comment: please document what happens if you try to use not-quite-general-purpose registers as operands (clobber or otherwise). The most important case is the PIC register. GCC does not appreciate inline asm that clobbers the PIC register. IMO it should be allowed, especially for things like CPUID on x86_32.

This also includes things like EBP/RBP. If I’m building with frame pointers on or I do something that forces a frame pointer, is RBP available? RSP is another example — presumably it’s not available.

@Amanieu
Member Author

Amanieu commented Jun 10, 2020

Several people wondered why this used a string constant; it'd be good to explain that in documentation. And this doesn't mean the assembly isn't parsed, it means it isn't parsed by rustc (it's parsed by LLVM in the backend).

Actually what I feel a lot of people are really asking is: "Why isn't this just like MSVC's __asm or D's inline assembly where I can just write the asm directly and the compiler will figure out the input/output/clobbers automagically?"

The issue with that approach is that in a lot of cases, we can't actually figure out the constraints just from the assembly code. Common examples are call and syscall instructions which can have an arbitrary ABI.

Finally, if you really want it, you can parse the asm code in a proc macro and derive the necessary input/output/clobber operations from it. In fact, this is what Clang does to support the MS __asm: it parses the ASM in the front-end (with some support from LLVM's MC layer) and rewrites the asm to use standard LLVM constraint codes.
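The `syscall` case makes the point concrete: nothing in the instruction text says which registers carry arguments or get overwritten, so the caller has to spell it out. A sketch assuming x86-64 Linux, where syscall number 39 is `getpid` and the `syscall` instruction overwrites `rcx` and `r11`; `raw_getpid` is a hypothetical helper and returns `None` on other targets.

```rust
/// Issues the `getpid` system call directly (x86-64 Linux only).
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
fn raw_getpid() -> Option<i64> {
    use std::arch::asm;
    const SYS_GETPID: i64 = 39; // assumption: x86-64 Linux syscall table
    let ret: i64;
    unsafe {
        asm!(
            "syscall",
            // The syscall number goes in via rax; the result comes back in rax.
            inlateout("rax") SYS_GETPID => ret,
            // `syscall` itself overwrites rcx and r11.
            lateout("rcx") _,
            lateout("r11") _,
            options(nostack),
        );
    }
    Some(ret)
}

/// Other targets: no raw syscall in this sketch.
#[cfg(not(all(target_os = "linux", target_arch = "x86_64")))]
fn raw_getpid() -> Option<i64> {
    None
}
```

None of the three operand lines could be derived from the single word `syscall` in the template, which is why the operands are declared explicitly.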

I’ll add one more comment: please document what happens if you try to use not-quite-general-purpose registers as operands (clobber or otherwise). The most important case is the PIC register. GCC does not appreciate inline asm that clobbers the PIC register. IMO it should be allowed, especially for things like CPUID on x86_32.

This is actually quite tricky and I believe it is a bug for the back-end (GCC/LLVM) to silently accept the use of these registers and then generate wrong code (for example by not preserving the PIC base around the asm). This should be fixed in the back-end or at the very least cause a compile-time error.

We currently always disallow the use of the stack pointer and frame pointer as operands for inline asm in the front-end. However the exact set of reserved registers depends not only on the current target but can also be different depending on the properties of the function:

  • The frame pointer (EBP) is only reserved if the function needs a frame pointer (e.g. dynamic alloca).
  • If the function requires stack realignment then a "base pointer" is also reserved (ESI on 32-bit x86, RBX on x86-64).

These properties can vary depending on the optimization level and inlining so there is no way we can selectively enforce this in the front-end. A blanket ban on the use of registers that may be reserved is also a non-starter since these registers are commonly used (e.g. for syscall arguments).
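A common way to live with a possibly reserved register is to save and restore it inside the asm block. The sketch below does this for CPUID on x86-64, where the instruction writes `ebx` but `rbx` cannot be named as an operand; it mirrors the well-known save-and-exchange pattern and is an illustration, not the thread's prescribed solution. The `cpuid` helper name is hypothetical.

```rust
/// Runs CPUID and returns [eax, ebx, ecx, edx]; `None` on non-x86-64 targets.
#[cfg(target_arch = "x86_64")]
fn cpuid(leaf: u32, sub_leaf: u32) -> Option<[u32; 4]> {
    use std::arch::asm;
    let eax: u32;
    let ebx: u32;
    let ecx: u32;
    let edx: u32;
    unsafe {
        asm!(
            // Save rbx in a scratch register before CPUID overwrites it.
            "mov {tmp:r}, rbx",
            "cpuid",
            // Swap: rbx gets its old value back, the scratch register gets
            // the ebx result produced by CPUID.
            "xchg {tmp:r}, rbx",
            tmp = out(reg) ebx,
            inout("eax") leaf => eax,
            inout("ecx") sub_leaf => ecx,
            out("edx") edx,
            options(nostack),
        );
    }
    Some([eax, ebx, ecx, edx])
}

#[cfg(not(target_arch = "x86_64"))]
fn cpuid(_leaf: u32, _sub_leaf: u32) -> Option<[u32; 4]> {
    None
}
```

The `:r` modifier forces the 64-bit name of the scratch register so that the whole of `rbx` is preserved, while the Rust-side output only reads the low 32 bits.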

@joshtriplett
Member

joshtriplett commented Jun 10, 2020

One more issue I've observed in several contexts: everyone formats asm! statements differently, and we should 1) come up with guidance on how to do so, and 2) implement that guidance in rustfmt.

Notably, this includes how to format both single-line and multi-line assembly statements.

EDIT: originally, in this comment, I proposed an initial set of requirements. To keep format bikeshedding out of this tracking issue, I've moved that to an issue on the fmt-rfcs repo.

@unageek

unageek commented Jun 10, 2020

I'm trying to implement an optimization barrier like black_box for a f64 value with asm!.

The following code is working so far. However, it seems that it's relying on an undocumented behavior of the compiler. Is there any legit way to avoid the "argument never used" error?

#[inline]
fn secure(mut x: f64) -> f64 {
    unsafe {
        // Won't compile without `# {0}` due to an "argument never used" error.
        asm!("# {0}", inout(xmm_reg) x, options(nomem, nostack));
    }
    x
}

rustc --version --verbose:

rustc 1.46.0-nightly (feb3536eb 2020-06-09)
binary: rustc
commit-hash: feb3536eba10c2e4585d066629598f03d5ddc7c6
commit-date: 2020-06-09
host: x86_64-unknown-linux-gnu
release: 1.46.0-nightly
Full example code (should be built with --release)
#![feature(asm)]

use core::arch::x86_64::*;

#[inline]
fn secure(mut x: f64) -> f64 {
    unsafe {
        asm!("# {0}", inout(xmm_reg) x, options(nomem, nostack));
    }
    x
}

#[inline]
fn mul(x: f64, y: f64) -> f64 {
    secure(secure(x) * secure(y))
}

fn main() {
    unsafe {
        _MM_SET_ROUNDING_MODE(_MM_ROUND_UP);
    }

    assert_ne!(-mul(-1.1, 10.1), mul(1.1, 10.1));

    unsafe {
        _MM_SET_ROUNDING_MODE(_MM_ROUND_NEAREST);
    }
}
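For readers on a stable toolchain, the barrier above can be written without the feature gate. This is a sketch only: it assumes x86-64, imports the macro from `core::arch`, and keeps the operand "used" by mentioning it inside an assembler comment; other targets fall back to `std::hint::black_box`. As the replies below point out, it does not by itself make changing the rounding mode sound.

```rust
/// Optimization barrier for an `f64` held in an SSE register (x86-64 only).
#[cfg(target_arch = "x86_64")]
#[inline]
fn secure(mut x: f64) -> f64 {
    unsafe {
        // The template emits no instruction. `{0}` appears only inside an
        // assembler comment so that the operand counts as used.
        core::arch::asm!(
            "/* {0} */",
            inout(xmm_reg) x,
            options(nomem, nostack, preserves_flags),
        );
    }
    x
}

/// Fallback for other targets, using the standard library's barrier.
#[cfg(not(target_arch = "x86_64"))]
#[inline]
fn secure(x: f64) -> f64 {
    std::hint::black_box(x)
}
```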

@bjorn3
Member

bjorn3 commented Jun 10, 2020

_MM_SET_ROUNDING_MODE(_MM_ROUND_UP);

I don't think a blackbox is enough for LLVM to always use the rounding mode you want. I think LLVM is allowed to reset the rounding mode at any point in your code. I think your code is only guaranteed to work fine when you pass the -fp-model=strict LLVM argument, in which case the blackbox is not necessary anyway.

https://reviews.llvm.org/D62731

precise
By default, the compiler uses /fp:precise behavior.
[...]
The compiler generates code intended to run in the default floating-point environment and assumes that the floating-point environment is not accessed or modified at runtime. That is, it assumes that the code does not unmask floating-point exceptions, read or write floating-point status registers, or change rounding modes.
[...]
strict
[...]
Under /fp:strict, the compiler generates code that allows the program to safely unmask floating-point exceptions, read or write floating-point status registers, or change rounding modes.

@Amanieu
Member Author

Amanieu commented Jun 10, 2020

See #72965 and #73056 for the issue with unused arguments.

@unageek

unageek commented Jun 10, 2020

I don't think a blackbox is enough for LLVM to always use the rounding mode you want. I think LLVM is allowed to reset the rounding mode at any point in your code. I think your code is only guaranteed to work fine when you pass the -fp-model=strict LLVM argument, in which case the blackbox is not necessary anyway.

Thank you for your reply. -fp-model (or any equivalent option) does not seem to be implemented in rustc yet. Is there a tracking issue for that? (Sorry for being off-topic.)

See #72965 and #73056 for the issue with unused arguments.

Thank you for letting me know about them!

@joshtriplett
Member

Another comment that seems relevant to capture:

For ARM, can we have the "." separators? They are standard for Aarch64 anyway, and they make the 32-bit code far easier to read. Like so:

add.s.ne
ldm.ia.cc

(in both orders, for those cases with two suffixes)

Are these supported by LLVM?

@bjorn3
Member

bjorn3 commented Jun 10, 2020

Thank you for your reply. -fp-model (or any equivalent option) does not seem to be implemented in rustc yet. Is there a tracking issue for that? (Sorry for being off-topic.)

I thought -fp-model was an LLVM option. I just read the actual diff of the patch I linked, and it turns out to be a clang option that causes it to emit LLVM float instructions with certain flags.

@Amanieu
Member Author

Amanieu commented Jun 10, 2020

For ARM, can we have the "." separators? They are standard for Aarch64 anyway, and they make the 32-bit code far easier to read.

Those aren't separators, they are part of the instruction name on AArch64. They do not exist on 32-bit ARM.

kennystrawnmusic added a commit to kennystrawnmusic/tinypci that referenced this issue Jul 16, 2020
Per the bug report referenced in the above, `asm!` is back ― but it has a [brand-new syntax](https://doc.rust-lang.org/nightly/unstable-book/library-features/asm.html) associated with it that includes, among other things, `intel` and `volatile` by default. As if that's not enough, they're now deprecating `llvm_asm!` upstream in favor of this.
@thomcc
Member

thomcc commented Dec 3, 2021

Given the caveats around labels with intel syntax (labels should be numeric, but can't be 0, 1, or a number made up of only those digits), is it worth getting an issue on file to at least improve the error messages? Is there one already?

@Amanieu
Member Author

Amanieu commented Dec 3, 2021

We have a NamedAsmLabels lint (named_asm_labels) which warns about using named labels; we could extend that to also warn about label names that look like binary numbers (only on x86 and with intel syntax).
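A short sketch of the label rules under discussion, assuming x86-64 and the default Intel syntax: the loop uses the numeric local labels `2` and `3` with `b`/`f` (backward/forward) suffixes, and avoids labels made up only of the digits 0 and 1. The function `sum_to` is made up for illustration.

```rust
/// Sums 1..=n with a hand-written loop (wraps on overflow).
#[cfg(target_arch = "x86_64")]
fn sum_to(n: u64) -> u64 {
    use std::arch::asm;
    let total: u64;
    unsafe {
        asm!(
            "test {count}, {count}",
            "jz 3f",               // n == 0: jump forward to label 3
            "2:",                  // numeric local label, not 0 or 1
            "add {total}, {count}",
            "dec {count}",
            "jnz 2b",              // jump backward to the nearest label 2
            "3:",
            count = inout(reg) n => _,
            total = inout(reg) 0u64 => total,
            options(nomem, nostack),
        );
    }
    total
}

/// Portable fallback so the sketch builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
fn sum_to(n: u64) -> u64 {
    (1..=n).fold(0u64, |acc, i| acc.wrapping_add(i))
}
```

Numeric local labels can be defined more than once in the same program, so the block stays valid if the compiler inlines or duplicates it; a named label would be emitted twice and fail to assemble.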

@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Dec 9, 2021
bors added a commit to rust-lang-ci/rust that referenced this issue Dec 14, 2021
Stabilize asm! and global_asm!

Tracking issue: rust-lang#72016

It's been almost 2 years since the original [RFC](rust-lang/rfcs#2850) was posted and we're finally ready to stabilize this feature!

The main changes in this PR are:
- Removing `asm!` and `global_asm!` from the prelude as per the decision in rust-lang#87228.
- Stabilizing the `asm` and `global_asm` features.
- Removing the unstable book pages for `asm` and `global_asm`. The contents are moved to the [reference](rust-lang/reference#1105) and [rust by example](rust-lang/rust-by-example#1483).
  - All links to these pages have been removed to satisfy the link checker. In a later PR these will be replaced with links to the reference or rust by example.
- Removing the automatic suggestion for using `llvm_asm!` instead of `asm!` if you're still using the old syntax, since it doesn't work anymore with `asm!` no longer being in the prelude. This only affects code that predates the old LLVM-style `asm!` being renamed to `llvm_asm!`.
- Updating `stdarch` and `compiler-builtins`.
- Updating all the tests.

r? `@joshtriplett`
flip1995 pushed a commit to flip1995/rust that referenced this issue Dec 17, 2021
bors bot added a commit to RoaringBitmap/roaring-rs that referenced this issue Jan 12, 2022
127: Add scalar optimizations from CRoaring / arXiv:1709.07821 section 3 r=Kerollmops a=saik0

### Purpose

This PR adds some optimizations from CRoaring as outlined in  arXiv:1709.07821 section 3

### Overview 
 * All inserts and removes are now branchless (!in arXiv:1709.07821, in CRoaring)
 * Section 3.1 was already implemented, except for `BitmapIter`. This is covered in #125
 * Implement Array-Bitset aggregates as outlined in section 3.2
   * Also branchless 😎
 * Tracks bitmap cardinality while performing bitmap-bitmap ops
   * This is a deviation from CRoaring, and will need to be benchmarked further before this Draft PR is ready
   * Curious to hear what you think about this `@lemire` 
 * In order to track bitmap cardinality the len field had to be moved into `Store::Bitmap`
   * This is unfortunately a cross-cutting change
 * `Store` was quite large (LoC) and had many responsibilities. The largest change in this draft is decomposing `Store` such that its field variants are two new container types, each responsible for maintaining its invariants and implementing `ops`
   * `Bitmap8K` keeps track of its cardinality
   * `SortedU16Vec` maintains its sorting
   * `Store` now only delegates to these containers
   * My hope is that this will be useful when implementing run containers. 🤞
   * Unfortunately so much code was moved that this PR is _HUGE_


### Out of scope
 * Inline ASM for Array-Bitset aggregates
 * Section 4 (explicit SIMD). As noted by the paper authors: The compiler does a decent job of autovectorization, though not as good as hand-tuned

### Notes
 * I attempted to emulate the inline ASM Array-Bitset aggregates by using a mix of unsafe ptr arithmetic and x86-64 intrinsics, hoping to compile to the same instructions. I was unable to get it under 13 instructions per iteration (compared to the paper's 5). While it was an improvement, I abandoned the effort in favor of waiting for the `asm!` macro to stabilize. rust-lang/rust#72016

Co-authored-by: saik0 <github@saik0.net>
Co-authored-by: Joel Pedraza <github@saik0.net>
@bstrie
Contributor

bstrie commented Jan 25, 2022

I see that there are some new feature flags in nightly for tracking the long tail of asm! stabilization: asm_const, asm_experimental_arch, asm_sym, and asm_unwind. Should we make dedicated tracking issues for these?

@Amanieu
Member Author

Amanieu commented Jan 26, 2022

I created separate tracking issues for each sub-feature:

This tracking issue can be closed now that asm! is stable.

@Amanieu Amanieu closed this as completed Jan 26, 2022
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Feb 17, 2022
Update tracking issue numbers for inline assembly sub-features

The main tracking issue for inline assembly is [closed](rust-lang#72016 (comment)); further tracking of the remaining sub-features has been moved to separate tracking issues.
not-jan pushed a commit to not-jan/roaring-rs that referenced this issue Aug 31, 2022
@mert-kurttutan

Hi @nagisa,
I am still experiencing the same issue that @marysaka mentioned (in 1.73 stable and nightly) on x86 hardware, here.
I am trying out a Rust port of some C code. The C code uses inline assembly with memory inputs, written as [x] "m" (x) where x is a variable defined in the current scope, so it can use variables directly from stack memory (instead of registers).
As a result, it needs fewer registers, because it can load multiple variables (from the stack) into the same register in different parts of the code.

In the Rust reference for inline assembly, the only input operand type is the register operand. Is there a similar way to use a variable as an input from memory rather than from a register?

@nagisa
Member

nagisa commented Nov 1, 2023

Next time please open a new issue, or even better, ask on Zulip, rather than commenting on one that has been closed for more than a year at this point.


As far as I know, no, memory operands are considered a future possibility by the RFC that implemented inline assembly and would require some design & implementation work.

Your next best bet is to pass in an address to the variable in a register.
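A minimal sketch of that workaround, assuming x86-64: the variable stays in memory, its address is passed in a register, and the asm block loads through it. The helper name `load_via_register` is hypothetical.

```rust
/// Loads `*value` from inside an asm block, given only its address in a register.
#[cfg(target_arch = "x86_64")]
fn load_via_register(value: &u64) -> u64 {
    use std::arch::asm;
    let loaded: u64;
    unsafe {
        asm!(
            // Dereference the pointer held in {addr}.
            "mov {dst}, qword ptr [{addr}]",
            addr = in(reg) value as *const u64,
            dst = out(reg) loaded,
            // `readonly`: the block reads memory but never writes it.
            options(readonly, nostack, preserves_flags),
        );
    }
    loaded
}

/// Portable fallback so the sketch builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
fn load_via_register(value: &u64) -> u64 {
    *value
}
```

Compared with a true memory operand this costs one register for the address, but several variables laid out next to each other (for example in a struct or array) can share a single base pointer and be addressed with offsets.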

Projects
Status: Stabilized