Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix incorrect fmt::Pointer implementations #328

Draft
wants to merge 4 commits into
base: master
Choose a base branch
from
Draft

Conversation

tyranron
Copy link
Collaborator

@tyranron tyranron commented Dec 23, 2023

Synopsis

Debug and Display derives allow referring fields via short syntax (_0 for unnamed fields and name for named fields):

#[derive(Display)]
#[display("{_0:o}")]
struct OctalInt(i32);

The way this works is by introducing a local binding in the macro expansion:

let _0 = &self.0;

This, however, introduces double pointer indirection. For most of the fmt traits, this is totally OK. However, the fmt::Pointer is sensitive to that:

#[derive(Display)]
#[display("--> {_0:p}")]
struct Int(&'static i32);

// expands to
impl fmt::Display for Int {
    fn fmt(&self, f: fmt::Formatter<'_>) -> fmt::Result {
        let _0 = &self.0; // has `&&i32` type, not `&i32`
        write!(f, "--> {_0:p}") // so, prints address of the `_0` local binding, 
                                // not the one of the `self.0` field as we would expect
    }
}

Solution

Instead of introducing local bindings, try to replace this shortcuts (like _0) in idents and expressions with a field accessing syntax (like self.0). Or event just substitute _0 with (*_0), because for enum variants we cannot really access the fields via self..

Checklist

  • Documentation is updated (not required)
  • Tests are added/updated
  • CHANGELOG entry is added

@tyranron tyranron added this to the 1.0.0 milestone Dec 23, 2023
@tyranron tyranron self-assigned this Dec 23, 2023
@JelteF
Copy link
Owner

JelteF commented Feb 20, 2024

@tyranron are you still planinng on addressing this?

@JelteF
Copy link
Owner

JelteF commented Mar 11, 2024

@tyranron ping

@tyranron
Copy link
Collaborator Author

@JelteF yes, sorry for not being responsive. I was a little bit overwhelmed on my job last 2 months. I'll try to finish this in next 2 weeks or so.

@JelteF
Copy link
Owner

JelteF commented Mar 11, 2024

No worries, it happens. "Next 2 weeks or so" sounds amazing.

@tyranron tyranron mentioned this pull request Apr 4, 2024
@tyranron
Copy link
Collaborator Author

tyranron commented Apr 5, 2024

Oh... crap. This is a bit of a rabbit hole.

I've figured out the solution for cases like #[display("{_0:p}")] or #[display("{:p}", _0)]. But cannot figure out the solution for complex expressions like #[display("{}", format!("{_0:p}"))], so at the moment #[display("{_0:p}")] and #[display("{}", format!("{_0:p}"))] won't produce the same result, and I cannot get how to solve it without going into deep parsing of expressions, which is non-trivial and will require full feature of syn.

We basically hit the difference between accessing a binding and a self.field Rust semantics. We want a binding to always behave like a self.field, but that's not possible in Rust without moving out. And replacing a binding with a *binding in arbitrary expressions seems non-trivial, if solvable in general case at all.

@JelteF I think that supporting #[display("{_0:p}")] and #[display("{:p}", _0)] cases is enough. For more complex situations like #[display("{}", format!("{_0:p}"))] it's better to describe field accessing semantics better in docs and warn explicitly about possible pointer arithmetic or formatting implications.

@tyranron
Copy link
Collaborator Author

tyranron commented Apr 8, 2024

@JelteF thinking about it more, I'm unsure whether we should fix #[display("{_0:p}")]/#[display("{:p}", _0)] at all, since we cannot fix it in all the cases. Because having different behavior with #[display("{}", format!("{_0:p}"))] is just too subtle.

Instead, I propose to leave desugaring as it is now, and:

  1. Describe in docs clearly that _0 and field_name has pointer semantics, not value semantics, giving a clear example of how {:p} formatting should be used correctly.
  2. For trivial detectable cases like #[display("{_0:p}")]/#[display("{:p}", _0)] produce a compilation error, guiding towards correct usage: #[display("{:p}", *_0)].

This way, throwing the compile-time error, we bring the library user's attention to this edge case, so he will much more aware about the implications for the #[display("{}", format!("{_0:p}"))] case, rather than silently introducing a different behavior (being the wrong one) without warning the library user about this in any way.

@JelteF
Copy link
Owner

JelteF commented Apr 8, 2024

tl;dr I like your previous proposal much better.

Honestly, I cannot think of a reasonable case where anyone would do something like this: #[display("{}", format!("{_0:p}"))]. So I think optimizing for that case seems a bit strange. Because that means not being able to use #[display("{_0:p}")], which seems a much more common situation.

I think it makes sense to document the weirdness around this, but I think we should support #[display("{_0:p}")] and #[display("{:p}", _0)].

@yfcai yfcai mentioned this pull request May 8, 2024
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants