Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for RFC 3519: arbitrary_self_types #44874

Open
1 of 8 tasks
Tracked by #165
arielb1 opened this issue Sep 26, 2017 · 121 comments
Open
1 of 8 tasks
Tracked by #165

Tracking issue for RFC 3519: arbitrary_self_types #44874

arielb1 opened this issue Sep 26, 2017 · 121 comments
Labels
B-RFC-approved Blocker: Approved by a merged RFC but not yet implemented. B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. F-arbitrary_self_types `#![feature(arbitrary_self_types)]` S-tracking-needs-summary Status: It's hard to tell what's been done and what hasn't! Someone should do some investigation. S-types-deferred Status: Identified as a valid potential future enhancement that is not currently being worked on T-lang Relevant to the language team, which will review and decide on the PR/issue. T-types Relevant to the types team, which will review and decide on the PR/issue.

Comments

@arielb1
Copy link
Contributor

arielb1 commented Sep 26, 2017

This is the tracking issue for RFC 3519: Arbitrary self types v2.

The feature gate for this issue is #![feature(arbitrary_self_types)].

About tracking issues

Tracking issues are used to record the overall progress of implementation. They are also used as hubs connecting to other relevant issues, e.g., bugs or open design questions. A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature. Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.

Steps

Unresolved Questions

None.

Related

Implementation history

TODO.


(Below follows content that predated the accepted Arbitrary Self Types v2 RFC.)

  • figure out the object safety situation
  • figure out the handling of inference variables behind raw pointers
  • decide whether we want safe virtual raw pointer methods

Object Safety

See #27941 (comment)

Handling of inference variables

Calling a method on *const _ could now pick impls of the form

impl RandomType {
    fn foo(*const Self) {}
}

Because method dispatch wants to be "limited", this won't really work, and as with the existing situation on &_ we should be emitting an "the type of this value must be known in this context" error.

This feels like fairly standard inference breakage, but we need to check the impact of this before proceeding.

Safe virtual raw pointer methods

e.g. this is UB, so we might want to force the call <dyn Foo as Foo>::bar to be unsafe somehow - e.g. by not allowing dyn Foo to be object safe unless bar was an unsafe fn

trait Foo {
    fn bar(self: *const Self);
}

fn main() {
    // creates a raw pointer with a garbage vtable
    let foo: *const dyn Foo = unsafe { mem::transmute([0usize, 0x1000usize]) };
    // and call it
    foo.bar(); // this is UB
}

However, even today you could UB in safe code with mem::size_of_val(foo) on the above code, so this might not be actually a problem.

More information

There's no reason the self syntax has to be restricted to &T, &mut T and Box<T>, we should allow for more types there, e.g.

trait MyStuff {
    fn do_async_task(self: Rc<Self>);
}

impl MyStuff for () {
    fn do_async_task(self: Rc<Self>) {
        // ...
    }
}

Rc::new(()).do_async_stuff();
@arielb1 arielb1 changed the title Allow methods with arbitrary self-types Allow trait methods with arbitrary self-types Sep 26, 2017
@aidanhs aidanhs added the C-feature-request Category: A feature request, i.e: not implemented / a PR. label Sep 28, 2017
@porky11
Copy link

porky11 commented Sep 28, 2017

Why would you need this?
Why wouldn't you write an impl like this:

impl MyStuff for Rc<()> {
    fn do_async_task(self) {
        // ...
    }
}

I'd rather define the trait different. Maybe like this:

trait MyStuff: Rc {
    fn do_async_task(self);
}

In this case, Rc would be a trait type. If every generic type implemented a specific trait (this could be implemented automatically for generic types) this seems more understandable to me.

@cuviper
Copy link
Member

cuviper commented Sep 28, 2017

This could only be allowed for trait methods, right?

For inherent methods, I can't impl Rc<MyType>, but if impl MyType can add methods with self: Rc<Self>, it seems like that would enable weird method shadowing.

@arielb1
Copy link
Contributor Author

arielb1 commented Oct 1, 2017

@cuviper

This is still pending lang team decisions (I hope there will be at least 1 RFC) but I think it will only be allowed for trait method impls.

@arielb1
Copy link
Contributor Author

arielb1 commented Oct 1, 2017

@porky11

You can't implement anything for Rc<YourType> from a crate that does not own the trait.

@arielb1
Copy link
Contributor Author

arielb1 commented Oct 2, 2017

So changes needed:

  • remove the current error message for trait methods only, but still have a feature gate.
  • make sure fn(self: Rc<Self>) doesn't accidentally become object-safe
  • make sure method dispatch woks for Rc<Self> methods
  • add tests

@arielb1 arielb1 added E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. and removed E-needs-mentor labels Oct 2, 2017
@SimonSapin
Copy link
Contributor

I’ll look into this.

@arielb1
Copy link
Contributor Author

arielb1 commented Oct 2, 2017

Note that this is only supported to work with trait methods (and trait impl methods), aka

trait Foo {
    fn foo(self: Rc<Self>);
}
impl Foo for () {
    fn foo(self: Rc<Self>) {}
}

and is NOT supposed to work for inherent impl methods:

struct Foo;
impl Foo {
    fn foo(self: Rc<Self>) {}
}

@SimonSapin
Copy link
Contributor

I got caught in some more Stylo work that's gonna take a while, so if someone else wants to work on this in the meantime feel free.

@kennytm
Copy link
Member

kennytm commented Oct 6, 2017

Is this supposed to allow any type as long as it involves Self? Or must it impl Deref<Target=Self>?

trait MyStuff {
    fn a(self: Option<Self>);
    fn b(self: Result<Self, Self>);
    fn c(self: (Self, Self, Self));
    fn d(self: Box<Box<Self>>);
}

impl MyStuff for i32 {
   ...
}

Some(1).a();  // ok?
Ok(2).b();  // ok?
(3, 4, 5).c(); // ok?
(box box 6).d(); // ok?

kennytm added a commit to kennytm/rust that referenced this issue Oct 10, 2017
…ents, r=nikomatsakis

Update comments referring to old check_method_self_type

I was browsing the code base, trying to figure out how rust-lang#44874 could be implemented, and noticed some comments that were out of date and a bit misleading (`check_method_self_type` has since been renamed to `check_method_receiver`). Thought it would be an easy first contribution to Rust!
@mikeyhew
Copy link
Contributor

I've started working on this issue. You can see my progress on this branch

@mikeyhew
Copy link
Contributor

mikeyhew commented Nov 2, 2017

@arielb1 You seem adamant that this should only be allowed for traits and not structs. Aside from method shadowing, are there other concerns?

@arielb1
Copy link
Contributor Author

arielb1 commented Nov 2, 2017

@mikeyhew

inherent impl methods are loaded based on the type. You shouldn't be able to add a method to Rc<YourType> that is usable without any use statement.

@arielb1
Copy link
Contributor Author

arielb1 commented Nov 2, 2017

That's it, if you write something like

trait Foo {
    fn bar(self: Rc<Self>);
}

Then it can only be used if the trait Foo is in-scope. Even if you do something like fn baz(self: u32); that only works for modules that use the trait.

If you write an inherent impl, then it can be called without having the trait in-scope, which means we have to be more careful to not allow these sorts of things.

@mikeyhew
Copy link
Contributor

mikeyhew commented Nov 3, 2017

@arielb1 Can you give an example of what we want to avoid? I'm afraid I don't really see what the issue is. A method you define to take &self will still be callable on Rc<Self>, the same as if you define it to take self: Rc<Self>. And the latter only affectsRc<MyStruct>, not Rc<T> in general.

@mikeyhew
Copy link
Contributor

mikeyhew commented Nov 4, 2017

I've been trying to figure out how we can support dynamic dispatch with arbitrary self types. Basically we need a way to take a CustomPointer<Trait>, and do two things: (1) extract the vtable, so we can call the method, and (2) turn it into a CustomPointer<T> without knowing T.

(1) is pretty straightforward: call Deref::deref and extract the vtable from that. For (2), we'll effectively need to do the opposite of how unsized coercions are implemented for ADTs. We don't know T, but we can can coerce to CustomPointer<()>, assuming CustomPointer<()> has the same layout as CustomPointer<T> for all T: Sized. (Is that true?)

The tough question is, how do we get the type CustomPointer<()>? It looks simple in this case, but what if CustomPointer had multiple type parameters and we had a CustomPointer<Trait, Trait>? Which type parameter do we switch with ()? In the case of unsized coercions, it's easy, because the type to coerce to is given to us. Here, though, we're on our own.

@arielb1 @nikomatsakis any thoughts?

@nikomatsakis
Copy link
Contributor

@arielb1

and is NOT supposed to work for inherent impl methods:

Wait, why do you not want it work for inherent impl methods? Because of scoping? I'm confused. =)

@nikomatsakis
Copy link
Contributor

@mikeyhew

I've been trying to figure out how we can support dynamic dispatch with arbitrary self types.

I do want to support that, but I expected it to be out of scope for this first cut. That is, I expected that if a trait uses anything other than self, &self, &mut self, or self: Box<Self> it would be considered no longer object safe.

@mikeyhew
Copy link
Contributor

mikeyhew commented Nov 4, 2017

@nikomatsakis

I do want to support that, but I expected it to be out of scope for this first cut.

I know, but I couldn't help looking into it, it's all very interesting to me :)

@arielb1
Copy link
Contributor Author

arielb1 commented Nov 5, 2017

Wait, why do you not want it work for inherent impl methods? Because of scoping? I'm confused. =)

We need some sort of "orphan rule" to at least prevent people from doing things like this:

struct Foo;
impl Foo {
    fn frobnicate<T>(self: Vec<T>, x: Self) { /* ... */ }
}

Because then every crate in the world can call my_vec.frobnicate(...); without importing anything, so if 2 crates do this there's a conflict when we link them together.

Maybe the best way to solve this would be to require self to be a "thin pointer to Self" in some way (we can't use Deref alone because it doesn't allow for raw pointers - but Deref + deref of raw pointers, or eventually an UnsafeDeref trait that reifies that - would be fine).

I think that if we have the deref-back requirement, there's no problem with allowing inherent methods - we just need to change inherent method search a bit to also look at defids of derefs. So that's probably a better idea than restricting to trait methods only.

Note that the CoerceSized restriction for object safety is orthogonal if we want allocators:

struct Foo;
impl Tr for Foo {
    fn frobnicate<A: Allocator+?Sized>(self: RcWithAllocator<Self, A>) { /* ... */ }
}

Where an RcWithAllocator<Self, A> can be converted to a doubly-fat RcWithAllocator<Tr, Allocator>.

@mikeyhew
Copy link
Contributor

mikeyhew commented Nov 5, 2017

@arielb1

Because then every crate in the world can call my_vec.frobnicate(...); without importing anything, so if 2 crates do this there's a conflict when we link them together.

Are saying is that there would be a "conflicting symbols for architechture x86_64..." linker error?

Maybe the best way to solve this would be to require self to be a "thin pointer to Self" in some way (we can't use Deref alone because it doesn't allow for raw pointers - but Deref + deref of raw pointers, or eventually an UnsafeDeref trait that reifies that - would be fine).

I'm confused, are you still talking about frobnicate here, or have you moved on to the vtable stuff?

@arielb1
Copy link
Contributor Author

arielb1 commented Nov 5, 2017

I'm confused, are you still talking about frobnicate here, or have you moved on to the vtable stuff?

The deref-back requirement is supposed to be for everything, not only object-safety. It prevents the problem when one person does

struct MyType;
impl MyType {
    fn foo<T>(self: Vec<(MyType, T)>) { /* ... */ }
}   

While another person does

struct OurType;
impl OurType {
    fn foo<T>(self: Vec<(T, OurType)>) {/* ... */ }
}   

And now you have a conflict on Vec<(MyType, OurType)>. If you include the deref-back requirement, there is no problem with allowing inherent impls.

@madsmtm
Copy link
Contributor

madsmtm commented May 5, 2023

Hey @adetaylor, I'm sorry I didn't really understand what you wrote above, I'm not well-enough versed in rustc to know such details.

What I wanted to discuss was basically that, if we stabilize arbitary_self_types (close to) as-is, what would be the expected semantics of the following in the future when we add MethodReceiver?

struct MyBox<T>(T);

impl<T> Deref<Target = T> for MyBox<T> { ... }
impl<T> DerefMut for MyBox<T> { ... }
unsafe impl<T> MethodReceiver for MyBox<T> { ... }

I would expect it be a compile-error, because MyBox<T>: Deref means that MethodReceiver is already implemented. Or would we expect something different? If so, what would the semantics be?


Maybe, as a library implementer, I'm just confused about your proposal for MethodReceiver: Why does it need the associated Target type and the receiver method? In my eyes, the trait could simply be used as:

// Crate foo
pub struct FooPtr<T>(NonNull<T>);

unsafe impl<T> MethodReceiver for FooPtr<T> {}
// Does not implement `Deref` and `DerefMut`, since that is undesired for this type

// Crate bar
struct Bar;

impl Bar {
    // Now possible, because of the `MethodReceiver` implementation.
    fn method(self: FooPtr<Self>) { ... }
}

Is it just because the other definition would allow handling *const T and *mut T cleaner? Or is there some other reason?

@moulins
Copy link
Contributor

moulins commented May 5, 2023

Maybe, as a library implementer, I'm just confused about your proposal for MethodReceiver: Why does it need the associated Target type and the receiver method?

Without the Target type, how do you know which type(s) to use for the method dispatch?


However, I'm not sure what the purpose of the receiver method is: once method lookup is done and we know what method to call, the MethodReceiver::Target type doesn't matter anymore, and in particular there is no need to obtain an actual *const R::Target at runtime.

Additionally, why restrict the number of MethodReceiver "coercions" to at most one?

// Assuming these:
impl<T: ?Sized> MethodReceiver for Cell<T> { type Target = T; }
impl<T: ?Sized> MethodReceiver for *const T { type Target = T; }

// One should be able to write:
struct Foo;
impl Foo {
  fn weird(self: *const Cell<Self>) { ... }
}

@Imberflur
Copy link

However, I'm not sure what the purpose of the receiver method is: once method lookup is done and we know what method to call, the MethodReceiver::Target type doesn't matter anymore, and in particular there is no need to obtain an actual *const R::Target at runtime.

AIUI, the pointer from the receiver method is necessary to get the vtable pointer when calling methods on a trait object. E.g.:

trait MyTrait {
    fn foo(self: Rc<Self>);
}

impl MyTrait for () {
    fn foo(self: Rc<Self>) { ... }
}

let bar: Rc<dyn MyTrait> = Rc::new(());
bar.foo(); // we need to find method at runtime via the vtable

@moulins
Copy link
Contributor

moulins commented May 6, 2023

AIUI, the pointer from the receiver method is necessary to get the vtable pointer when calling methods on a trait object.

Isn't that handled by the DispatchFromDyn mechanism? Also, I don't think custom method receivers should be required to be object-safe, as this would disallow potentially useful types like self: RefMut<'_, Self> (which currently cannot be object-safe, as it contains more than one non-ZST field).

@adetaylor
Copy link
Contributor

Hi @madsmtm ,

I would expect it be a compile-error, because MyBox<T>: Deref means that MethodReceiver is already implemented. Or would we expect something different? If so, what would the semantics be?

You're assuming a blanket implementation of MethodReceiver for Deref, and I wasn't assuming that, so thanks for highlighting this - something to figure out during the pre-RFC process. I will add exactly this question to the blank pre-pre-RFC linked above.

@faucct
Copy link

faucct commented May 13, 2023

This generic syntax also works under the flag and it would be nice to get it stabilized: Actually, it makes the compiler panic:

impl RandomType {
  async fn foo(self: &(impl Deref<Target=Self> + Clone + Send + 'static)) {
  }
}

This way both Arc<_> and Box::leak are accepted. The Arc variant is being suggested to use with tokio::spawn, when trying to create futures referencing self, though Box::leak could also work.

@BattyBoopers
Copy link

BattyBoopers commented May 24, 2023

I'd like to mention this issue here.
The ICE was fixed with an error message, but I'd actually like to see this kind of usage supported:

#![feature(arbitrary_self_types)]
use std::ops::Deref;

struct Foo(u32);
impl Foo {
    fn get<R: Deref<Target=Self>>(self: R) -> u32 {
        self.0
    }
}

fn main() {
    let mut foo = Foo(1);
    foo.get::<&Foo>(); // Error
}

The issue being that the self parameter is generic, however, in the call the turbofish operator is used to specify which exact argument type is being used.
This doesn't seem to work together with coercing the self argument into the correct reference type though, which I'd argue it should do.

@davidhewitt
Copy link
Contributor

@adetaylor I just wanted to say thank you for exploring an RFC for this, arbitrary self types provide an elegant solution for the main challenge I would like to resolve before a PyO3 1.0 (i.e. the Python interop you note above).

If there's anything I can do to help you, please ping me!

@adetaylor
Copy link
Contributor

Thanks also to @Urhengulas and @amanjeev who are working on it too!

@oli-obk
Copy link
Contributor

oli-obk commented Apr 8, 2024

This issue has become impossible to follow (150 comments, most of them hidden by github).

Tracking issues in general have a tendency to become unmanageable. Please open a dedicated new issue and label it with F-arbitrary_self_types `#![feature(arbitrary_self_types)]` for absolutely any topics you want to discuss or have questions about. See rust-lang/compiler-team#739 for details and discussions on this prospective policy.

@rust-lang rust-lang unlocked this conversation Apr 15, 2024
@traviscross traviscross added the B-RFC-approved Blocker: Approved by a merged RFC but not yet implemented. label May 6, 2024
@traviscross traviscross changed the title Tracking issue for arbitrary_self_types Tracking issue for RFC 3519: arbitrary_self_types May 6, 2024
@traviscross
Copy link
Contributor

traviscross commented May 6, 2024

For all those following this tracking issue, note that RFC 3519 ("Arbitrary self types v2") has been accepted and merged:

We'll continue to use this issue to track the progress toward this second version of arbitrary self types.

Thanks to @adetaylor for pushing forward on this important work.

@adetaylor
Copy link
Contributor

Here's how I plan to approach actually implementing this.

For the immediate future, I'll be working in this branch, rebasing and rewriting history frequently. Bits of that branch will occasionally migrate to PRs which I try to land.

The tricky bit about landing this work is that there are already a cluster of feature gates involved, and we need to adjust them towards this agreed end-state incrementally.

I think the phasing of PRs has to be something like this:

  1. Fix the current lifetime elision bugs per self: &Box<Self> produces confusing error due to failure to spot elided lifetime #117715 and corresponding PR. That got stalled waiting on me as I was dealing with the RFC.

  2. Search for potentially conflicting method candidates per the deshadowing part of the RFC. This is the riskiest part - per

    the production of errors requires us to interrogate more candidates to look for potential conflicts, so this could have a compile-time performance penalty which we should measure

    let's attempt to land this first and see if performance issues show up. If so, it's back to the drawing board.

  3. Block generic arbitrary self types. The RFC discussion process suggested that the complexities were too much, and the consensus was to block them with new diagnostics. They're allowed by the current arbitrary_self_types feature gate. We should first of all remove that ability from the existing arbitrary self types code, which is throwaway work, but it will be good to find out early whether this has ecosytem concerns (i.e. anyone relies on it - obviously these are folks relying on the existing unstable feature so we can break them, but we want to know).

  4. Introduce an arbitrary_self_types_pointers feature gate, and move the existing arbitrary self types pointer support to that. In a sense this is temporary throwaway work, since the existing arbitrary self types mechanism will be replaced later in the sequence, but this frees up the arbitrary_self_types feature gate to control what we want. Again, we're doing this early to see if this causes unexpected problems for existing crates using the existing unstable feature.

  5. Rename the old Receiver trait. Ideally without bikeshedding on the name too much, but we'll see how that goes. Fortunately it should only exist temporarily. This trait should only have been used in the standard library, but doubtless a crater run will yield a surprise or two.

  6. Land the new Receiver trait without it doing anything. (I assume it's possible to land things in the standard library even if they're not yet used by the compiler). Include thorough documentation on the restrictions around methods.

  7. Switch over the method resolution to use the Receiver trait, if the arbitrary_self_types feature is enabled. The main event.

  8. Add diagnostics for the !Sized case and the NonNull etc. cases.

  9. Update the Rust reference.

  10. Once the dust has settled, a few releases later, look into stabilizing the arbitrary_self_types feature gate.

Help with any of these stages is very much appreciated - as is any wisdom about whether there's a better order to do things, or whether things should be split up into more or fewer PRs - it's my first major Rust change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
B-RFC-approved Blocker: Approved by a merged RFC but not yet implemented. B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. F-arbitrary_self_types `#![feature(arbitrary_self_types)]` S-tracking-needs-summary Status: It's hard to tell what's been done and what hasn't! Someone should do some investigation. S-types-deferred Status: Identified as a valid potential future enhancement that is not currently being worked on T-lang Relevant to the language team, which will review and decide on the PR/issue. T-types Relevant to the types team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests