Shrink from_raw_parts's MIR so that Vec::deref MIR-inlines again #123190

Closed

Conversation

scottmcm
Member

Fixes #123174
cc @CAD97

Two commits; the first adds the codegen test so that you can see the diff clearly in the second commit.

@scottmcm scottmcm added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Mar 29, 2024
@rustbot
Collaborator

rustbot commented Mar 29, 2024

r? @m-ou-se

rustbot has assigned @m-ou-se.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 29, 2024

@scottmcm scottmcm force-pushed the simplify-from-raw-parts-mir branch from fcea56f to 3e4a577 Compare March 29, 2024 09:37

Member

@saethlin saethlin left a comment

The new MIR looks a lot nicer; I think the previous MIR had a lot of union juggling that no MIR opt currently cleans up. The codegen test here demonstrates the inlining improvement, but can you post the before-and-after MIR for Vec::deref so everyone can see?

@RalfJung
Member

Hm, I'm not happy with making our libcore a tangled mess just to get the MIR a bit nicer. That's a maintenance nightmare...

@saethlin
Member

🤷 I'm leaving that decision up to libs. It's possible to get the inlining here by other means, but I'm not sure what MIR optimizations are required to clean up from_raw_parts or if we want them.

@scottmcm scottmcm force-pushed the simplify-from-raw-parts-mir branch 2 times, most recently from a3bb26f to e27ce30 Compare March 29, 2024 12:53
@scottmcm
Member Author

scottmcm commented Mar 29, 2024

I've pushed a rearrangement of things to attempt to address the "tangled mess" concerns.

There's now a (non-exported) ptr::internal_repr module which is the only part allowed to know about the layout. It exports two safe pub(super) (aka pub(in crate::ptr)) functions which are used elsewhere in the ptr module to implement the various *from_raw_parts* and metadata functions.

The resulting diffs outside the new module, such as

 pub const fn slice_from_raw_parts<T>(data: *const T, len: usize) -> *const [T] {
-    from_raw_parts(data.cast(), len)
+    internal_repr::from_raw_parts(data, len)
 }

seem entirely fine to me.
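For readers unfamiliar with the function in that diff, here is a minimal sketch of calling the stable public `std::ptr::slice_from_raw_parts` with a typed data pointer and a length. The helper name `as_slice_ptr` is made up for illustration, and the private `internal_repr` module from this PR is deliberately not used.

```rust
use std::ptr;

/// Hypothetical helper: rebuild a raw slice pointer from a slice's
/// data pointer and its length, without going through a `*const ()` cast.
fn as_slice_ptr(v: &[u8]) -> *const [u8] {
    // `v.as_ptr()` is already a `*const u8`, so it can be passed as-is.
    ptr::slice_from_raw_parts(v.as_ptr(), v.len())
}
```

Dereferencing the returned pointer is only sound while the original slice is still alive, so callers would typically keep the source borrow around.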

Like the previous iteration, there's no more need for PtrRepr. Instead, there's just PtrComponents which is transmuted to/from. And because it's only ever transmuted, the Copy/Clone impls are removed as unnecessary.

Personally I rather like it, since it lets code elsewhere in the library keep using the pointers it already has, rather than needing to cast them or wrap/unwrap them. (An extra PtrToPtr cast doesn't matter much by itself, but it is 1/20 of the default #[inline] budget.)

@scottmcm
Member Author

scottmcm commented Mar 29, 2024

#122975 means that what you see below is very different from the previous nightly, but that change on its own isn't enough to get Vec::deref to inline.

Before this PR, with -Z inline-mir-hint-threshold=9999 added to the test to force the inlining so that the MIR can be seen:

    bb0: {
        StorageLive(_4);
        StorageLive(_2);
        _2 = &((*_1).0: alloc::raw_vec::RawVec<u8>);
        StorageLive(_3);
        _3 = ((((*_1).0: alloc::raw_vec::RawVec<u8>).0: std::ptr::Unique<u8>).0: std::ptr::NonNull<u8>);
        _4 = (_3.0: *const u8);
        StorageDead(_3);
        StorageDead(_2);
        StorageLive(_5);
        _5 = ((*_1).1: usize);
        StorageLive(_6);
        _6 = _4 as *const () (PtrToPtr);
        StorageLive(_8);
        StorageLive(_7);
        _7 = std::ptr::metadata::PtrComponents::<[u8]> { data_pointer: _6, metadata: _5 };
        _8 = std::ptr::metadata::PtrRepr::<[u8]> { const_ptr: move _7 };
        StorageDead(_7);
        _9 = (_8.0: *const [u8]);
        StorageDead(_8);
        StorageDead(_6);
        StorageDead(_5);
        StorageDead(_4);
        _0 = &(*_9);
        return;
    }

After this PR:

    bb0: {
        StorageLive(_4);
        StorageLive(_2);
        _2 = &((*_1).0: alloc::raw_vec::RawVec<u8>);
        StorageLive(_3);
        _3 = ((((*_1).0: alloc::raw_vec::RawVec<u8>).0: std::ptr::Unique<u8>).0: std::ptr::NonNull<u8>);
        _4 = (_3.0: *const u8);
        StorageDead(_3);
        StorageDead(_2);
        StorageLive(_5);
        _5 = ((*_1).1: usize);
        StorageLive(_6);
        _6 = std::ptr::internal_repr::PtrComponents::<*const u8, usize> { data_pointer: _4, metadata: _5 };
        _7 = move _6 as *const [u8] (Transmute);
        StorageDead(_6);
        StorageDead(_5);
        StorageDead(_4);
        _0 = &(*_7);
        return;
    }

After the post-inlining simplifications, the difference is just one fewer cast and a Transmute instead of the PtrRepr cast-via-union. But two fewer statements is all that was needed, and if you don't count the Storage statements, 2 out of 9 is not a trivial percentage.
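To relate the MIR above to source code, here is a rough sketch of the shape of a `Vec::deref`-style function: read a data pointer and a length out of the struct, then build a slice reference from them. `RawBuf` is a hypothetical stand-in written for this example and is not the real `Vec` implementation.

```rust
use std::marker::PhantomData;
use std::ops::Deref;
use std::slice;

/// Hypothetical borrowed buffer that stores its parts the way `Vec` does:
/// a thin data pointer plus a length.
struct RawBuf<'a> {
    ptr: *const u8,
    len: usize,
    _marker: PhantomData<&'a [u8]>,
}

impl<'a> RawBuf<'a> {
    fn new(data: &'a [u8]) -> Self {
        RawBuf { ptr: data.as_ptr(), len: data.len(), _marker: PhantomData }
    }
}

impl<'a> Deref for RawBuf<'a> {
    type Target = [u8];

    fn deref(&self) -> &[u8] {
        // SAFETY: `ptr` and `len` came from a live `&'a [u8]`, and the
        // `PhantomData` keeps that borrow alive for as long as `self` exists.
        unsafe { slice::from_raw_parts(self.ptr, self.len) }
    }
}
```

Everything `slice::from_raw_parts` does internally ends up in the caller's MIR once it is inlined, which is why its size matters for whether `deref` itself stays under the inlining threshold.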

@scottmcm scottmcm force-pushed the simplify-from-raw-parts-mir branch from e27ce30 to 013c383 Compare March 29, 2024 13:29
@RalfJung
Member

Should we have a pass that turns transmute-via-union into a transmute statement?
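As an illustration of the two patterns being compared, the sketch below reinterprets a value once by writing one union field and reading another, and once with `mem::transmute`. The types `Pair` and `PairRepr` are invented for this example and use a layout that is fully specified by `repr(C)`; they are not the `PtrComponents`/`PtrRepr` types from libcore.

```rust
use std::mem;

/// Hypothetical 8-byte value with a defined layout and no padding.
#[repr(C)]
#[derive(Clone, Copy)]
struct Pair {
    lo: u32,
    hi: u32,
}

/// Hypothetical union used only to reinterpret `Pair` as raw bytes.
#[repr(C)]
union PairRepr {
    pair: Pair,
    bytes: [u8; 8],
}

/// Reinterpret via a union: construct with one field, read the other.
fn via_union(p: Pair) -> [u8; 8] {
    // SAFETY: both fields are 8 bytes and every bit pattern is a valid `[u8; 8]`.
    unsafe { PairRepr { pair: p }.bytes }
}

/// Reinterpret via a single transmute.
fn via_transmute(p: Pair) -> [u8; 8] {
    // SAFETY: same size, and every bit pattern is a valid `[u8; 8]`.
    unsafe { mem::transmute::<Pair, [u8; 8]>(p) }
}
```

Both functions should return the same bytes for the same input; the union version simply takes more statements to express it.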

@scottmcm
Member Author

I'd love to have a real InstCombine pass that can easily look at multiple statements at once.

Any progress on one since #105808?

@saethlin
Member

Not from me. Anyone can resurrect the general strategy in that PR, so long as they have a fix for the storage liveness issue @cjgillot pointed out. It's somewhere in the PR comments, but I can't load the page right now to find it.

@cjgillot
Contributor

Taking a step back, we may need to rethink how we create and use wide pointers in MIR.
To construct them, we have Cast (Unsize) and the union that you are trying to remove.
To deconstruct them, we have casts to thin pointers, Len of dereference and that same union.

I wonder if we should go towards:

  • construction = Cast (Unsize) and a dedicated AggregateKind::WidePointer, making from_raw_parts an intrinsic;
  • destruction = cast to thin pointer and Rvalue::Metadata which works on the pointer and not the pointee.

The fact that Len needs to dereference obfuscates what happens, and Metadata on the pointer would make it clear there is no dereference involved. I have not investigated the effect on MIR building though.
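At the library level, the same construct/deconstruct split can be sketched with stable `NonNull` methods. This is only an analogy for the proposed MIR operations, not their implementation; the function names `make_wide` and `split_wide` are invented here. Note that neither function dereferences the pointer.

```rust
use std::ptr::NonNull;

/// Construction: combine a thin data pointer with its metadata (the length).
fn make_wide(data: NonNull<u32>, len: usize) -> NonNull<[u32]> {
    NonNull::slice_from_raw_parts(data, len)
}

/// Destruction: cast back to a thin pointer and read the metadata
/// from the pointer itself, not from the pointee.
fn split_wide(wide: NonNull<[u32]>) -> (NonNull<u32>, usize) {
    (wide.cast::<u32>(), wide.len())
}
```

Because nothing is dereferenced, this works even on a dangling pointer, which is the property the `Metadata`-on-the-pointer idea is meant to make explicit.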

Vec::deref would look something like:

    bb0: {
        StorageLive(_4);
        StorageLive(_2);
        _2 = &((*_1).0: alloc::raw_vec::RawVec<u8>);
        StorageLive(_3);
        _3 = ((((*_1).0: alloc::raw_vec::RawVec<u8>).0: std::ptr::Unique<u8>).0: std::ptr::NonNull<u8>);
        _4 = (_3.0: *const u8);
        StorageDead(_3);
        StorageDead(_2);
        StorageLive(_5);
        _5 = ((*_1).1: usize);
        StorageLive(_6);
        _6 = _4 as *const () (PtrToPtr);
        StorageLive(_7);
        _7 = WidePointer::<[u8]> { data_pointer: _6, metadata: _5 };
        StorageDead(_6);
        StorageDead(_5);
        _0 = &(*_7);
        StorageDead(_7);
        return;
    }

Bonus: this may make it possible to simplify slice construction, as the current GVN is unable to understand the union dance.

@saethlin
Member

It doesn't make much sense to have slice construction implemented in the library like it is now. I'm pretty sure -Zrandomize-layout cannot change the field order in slices because it can't fix up the slice construction logic, because it's in the library based on a repr(C) struct. We should fix that.

@bors
Contributor

bors commented Apr 5, 2024

☔ The latest upstream changes (presumably #123484) made this pull request unmergeable. Please resolve the merge conflicts.

@scottmcm
Member Author

Closing in favour of #123840

@scottmcm scottmcm closed this Apr 12, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 21, 2024
…gillot

Add an intrinsic for `ptr::from_raw_parts(_mut)`

Fixes rust-lang#123174
cc `@CAD97` `@saethlin`
r? `@cjgillot`

As suggested in rust-lang#123190 (comment), this adds a new `AggregateKind::RawPtr` for creating a pointer from its data pointer and its metadata.

That means that `slice::from_raw_parts` and friends no longer need to hard-code pointer layout into `libcore`, and because it no longer does union hacks the MIR is shorter and more amenable to optimizations.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Apr 21, 2024
…cjgillot

Add an intrinsic for `ptr::from_raw_parts(_mut)`

Fixes rust-lang#123174
cc `@CAD97` `@saethlin`
r? `@cjgillot`

As suggested in rust-lang#123190 (comment), this adds a new `AggregateKind::RawPtr` for creating a pointer from its data pointer and its metadata.

That means that `slice::from_raw_parts` and friends no longer need to hard-code pointer layout into `libcore`, and because it no longer does union hacks the MIR is shorter and more amenable to optimizations.
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Apr 21, 2024
Rollup merge of rust-lang#123840 - scottmcm:aggregate-kind-rawptr, r=cjgillot

Add an intrinsic for `ptr::from_raw_parts(_mut)`

Fixes rust-lang#123174
cc `@CAD97` `@saethlin`
r? `@cjgillot`

As suggested in rust-lang#123190 (comment), this adds a new `AggregateKind::RawPtr` for creating a pointer from its data pointer and its metadata.

That means that `slice::from_raw_parts` and friends no longer need to hard-code pointer layout into `libcore`, and because it no longer does union hacks the MIR is shorter and more amenable to optimizations.
Development

Successfully merging this pull request may close these issues.

UB Check blocks MIR inlining of Vec::deref
8 participants