Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stacked Borrows: Retag (private) fields of ADTs? #125

Closed
RalfJung opened this issue Apr 25, 2019 · 21 comments
Closed

Stacked Borrows: Retag (private) fields of ADTs? #125

RalfJung opened this issue Apr 25, 2019 · 21 comments
Labels
A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows) C-open-question Category: An open question that we should revisit S-pending-design Status: Resolving this issue requires addressing some open design questions

Comments

@RalfJung
Copy link
Member

RalfJung commented Apr 25, 2019

The following code currently gets rejected by Miri:

use std::cell::{RefCell, Ref};

fn break_it(rc: &RefCell<i32>, r: Ref<'_, i32>) {
    // `r` has a shared reference, it is passed in as argument and hence
    // a protector is added that marks this memory as read-only for the entire
    // duration of this function.
    drop(r);
    // *oops* here we can mutate that memory.
    *rc.borrow_mut() = 2;
}

fn main() {
    let rc = RefCell::new(0);
    break_it(&rc, rc.borrow())
}

A similar issue exists with RefMut, and vec_deque::Drain also has this problem.

In each of these cases, a protector gets added for a reference that is stored in a private field, and that reference gets invalidated while the protector is still active.

Another way to phrase is: Are types allowed to "lie" about the lifetime of references stored in private fields? Also see rust-lang/rust-memory-model#5.

@RalfJung
Copy link
Member Author

RalfJung commented Aug 2, 2019

With rust-lang/miri#872, retags no longer descend into fields of structs/enums/...

In particular, no barriers/protectors are added for references passed inside other types. This fixes the problem here. But I will leave the issue open because, it is very unclear if that is the right solution -- I implemented it because it helps not reject code that we are not sure is wrong.

bors added a commit to rust-lang/miri that referenced this issue Aug 2, 2019
Make Retag shallow

A shallow retag does not traverse into fields of compound typed to search for references to retag. It only retags "top-level"/"bare" references (and boxes).

This helps with rust-lang/unsafe-code-guidelines#125 because it also means that we do not add protectors for references passed in fields of a struct (or other compound types). Until we know what the rules should be for protectors, I prefer to be less aggressive about what we are rejecting.
This also matches our work-in-progress Coq formalization.
@Lokathor
Copy link
Contributor

Lokathor commented Aug 2, 2019

I'm not super fluent with RefCell stuff, but near as I can tell, break_it should be valid as long as the Ref and the RefCell aren't related. Similarly, main looks like it's totally fine on its own.

Since this is all safe code, then there should be a panic instead of UB when *rc.borrow_mut() = 2; is attempted. Since my guess is that there's no way to make a panic happen correctly in that situation, I think that functions need to relax their barriers until it works right.

@RalfJung
Copy link
Member Author

RalfJung commented Aug 2, 2019

This is all safe code, so if there is UB it's RefCell's fault. What it could do to avoid UB is to use raw pointers in Ref and RefMut. In some sense that is more correct, but so far it is not clear whether we want to require that.

@Lokathor
Copy link
Contributor

Lokathor commented Aug 2, 2019

Yeah, I know that in some cases RefCell type can panic to avoid UB, I'm not sure if it's possible here.

If I understand you correctly, the problem that RefCell has is that the LLVM IR for it is being too strict, but RefCell has no way to examine what the IR is of course.

Since something like RefCell can be made by any user of Rust, I think it's better to fix the LLVM IR generation for this sort of situation (if possible) instead of trying to explain this edge case to every user of Rust ever.

@RalfJung
Copy link
Member Author

RalfJung commented Aug 2, 2019

No, this shouldn't panic. This is an intended usecase for RefCell.

the LLVM IR for it is being too strict, but RefCell has no way to examine what the IR is of course.

Not sure where you are getting that from, LLVM IR has not been mentioned in this thread.
Stacked Borrows is too strict.

@Lokathor
Copy link
Contributor

Lokathor commented Aug 2, 2019

Ah, I thought that the "Barriers for private fields" was an LLVM IR element.

@RalfJung
Copy link
Member Author

RalfJung commented Aug 21, 2019

Turns out what we actually do emit noalias for some types that are not references but contain them, such as Pin<&mut T>. Visibility does not affect this.

@eddyb @rkruppe do you know for which types we do this? Is it only newtypes? If we also did this e.g. for Ref, RefMut or vec_deque::Drain, that would be plain wrong. The code for this is here but I cannot really turn that into a high-level description of when we do or do not add noalias.

@RalfJung
Copy link
Member Author

At least Ref and RefMut actually do get noalias... ouch. I'll turn this into a rustc bugreport, for Ref I think we can get UB here even without -Zmutable-noalias.

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 21, 2019

Turns out what we actually do emit noalias for some types that are not references but contain them, such as Pin<&mut T>.

At least Ref and RefMut actually do get noalias

I'm not seeing that here: https://rust.godbolt.org/z/69WSrn . We do emit noalias for Pin<&T> and Ref, but not Pin<&mut T> or RefMut.

AFAICT, the rule is that if an Adt field is noalias and the ABI of the Adt is Pointer, then the Adt is noalias. For some reason we don't emit noalias for Unique..

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 21, 2019

Only when enabling -Zmutable-noalias (https://rust.godbolt.org/z/fmDt0o) we do see noalias being emitted for &mut T, Pin<&mut T> and RefMut.

@RalfJung
Copy link
Member Author

RalfJung commented Aug 21, 2019

@gnzlbg you need to set -Zmutable-noalias (which I forgot to mention in this thread).

@RalfJung
Copy link
Member Author

RalfJung commented Aug 21, 2019

That flag exists solely because of LLVM bugs, see rust-lang/rust#54878. Rust must still generate correct (UB-free) LLVM IR even when that flag is set.

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 21, 2019

Sure, I just tried your examples out and wasn't seeing the noalias.

For Ref there is UB without -Zmutable-noalias (https://rust.godbolt.org/z/EWZ85U):

fn foo(rc: &RefCell<i32>, r: Ref<'_, i32>) { ... }

produces

define void @foo(
    { i64, i32 }* nocapture readnone align 8 dereferenceable(16) %rc, 
    i32* noalias readonly align 4 dereferenceable(4), i64* align 8 dereferenceable(8) %r)
{ ... }

which is incorrect since %r aliases %rc.1 .

@RalfJung
Copy link
Member Author

RalfJung commented Aug 21, 2019

@gnzlbg here's a concrete example turning this into UB: rust-lang/rust#63787.

@eddyb
Copy link
Member

eddyb commented Aug 22, 2019

Is Ref having noalias readonly on the data reference a problem because, unlike a regular reference, it can be invalidated by Drop?

Could we then rely on the presence of a Drop impl to disable noalias readonly, or should Ref itself use a raw pointer or &UnsafeCell?

EDIT: oops, the discussion is over at rust-lang/rust#63787 (comment).

@Diggsey
Copy link

Diggsey commented Oct 16, 2020

With references, we generally don't distinguish between a move and a copy/reborrow as they are Copy types, but perhaps this is a case where that distinction matters?

If the reference is being moved into break_it, then it seems to me that RefCell is correct as-is. For SB, the item corresponding to that reference should be moved onto the other side of the stack protector when the call occurs. Similarly, if a reference is returned from a function, it should be moved back to the original side of the stack protector.

If instead the reference is being copied/reborrowed into the function, then RefCell is incorrect, because the original reference is still in existence.

So: when is the reference moved vs reborrowed/copied? I think that would depend on whether the struct containing it is reborrowed/copied, and that depends on whether the struct is itself Copy. If the struct is not Copy, then the reference should be considered to move rather than be copied/reborrowed.

Admittedly, this line of thinking leads to some rather bizarre behaviour: generic code could become UB when concrete types are plugged in, or if additional trait bounds are added:

fn move<T>(x: T) {
    break_it(x);
}

->

fn reborrow<T: Copy>(x: T) {
    break_it(x); // UB because `x` is not a move anymore!
}

So instead, it may be better to be more conservative, and treat more things as moves, even when they could be reborrowes/copies (the cost being that optimisations depending on the stack protectors would not be possible in those cases).

@chorman0773
Copy link
Contributor

In lccc, there is a mechanism to indicate that the validity of a pointer does not continue past a particular point (a so-called "destroy" operation). That would make the above code sound (and thus RefCell itself), since it invalidates the shared reference in r before reborrowing the RefCell as mutable. Would it be reasonable for Stacked Borrows to have something like this for these situations. For example, an "end protector" operation could be used to the same effect, and provide a valid way for rust code to pop a protected item.

@RalfJung
Copy link
Member Author

RalfJung commented Dec 5, 2020

I am not sure how that's supposed to work -- are you saying there should be a new primitive that programmers can use to invoke this operation, or is the operation somehow inserted automatically at certain places?

@chorman0773
Copy link
Contributor

It would be a primitive operation that could be inserted in particular places, such as the destructors of types like Box in particular, but also extending to Ref and RefMut.
For user code, I think it would be desirable to have the operation available, but not necessarily imperative right now. This could be in the form of a field attribute, or an attribute on the drop function.

@RalfJung
Copy link
Member Author

Some updates:

  • In Miri, we are pretty close to enabling retagging of all fields by default: Tracking issue for enabling field retagging by default miri#2528.
  • @saethlin has raised concerns that retagging all fields might be too much UB, and maybe we should only do that for repr(transparent) types. (I can't find a link to that discussion now.)
  • Codegen currently assumes it happens for all types with Scalar and ScalarPair ABI, which is a lot more than just repr(transparent) types.
  • @pcwalton expressed interest on optimizations for data passed in 'tuples of immutable references', which would boil down to doing retagging for the elements of arbitrairly-sized tuples -- and presumably structs, too, since it'd be strange to treat them differently. So that would mean full recursive retagging of everything.

@JakobDegen JakobDegen added the S-pending-design Status: Resolving this issue requires addressing some open design questions label Aug 1, 2023
@RalfJung
Copy link
Member Author

RalfJung commented Apr 1, 2024

Miri has enabled field retagging for a while now, and it did not trigger an apocalypse. I think we can close this now, as this makes Miri match long-standing precedent of rustc itself. If someone wants to propose to change this, that should be a new issue with accompanying motivation.

@RalfJung RalfJung closed this as completed Apr 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows) C-open-question Category: An open question that we should revisit S-pending-design Status: Resolving this issue requires addressing some open design questions
Projects
None yet
Development

No branches or pull requests

7 participants