Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release 0.15 #476

Closed
dtolnay opened this issue Sep 2, 2018 · 19 comments
Closed

Release 0.15 #476

dtolnay opened this issue Sep 2, 2018 · 19 comments
Labels
Milestone

Comments

@dtolnay
Copy link
Owner

dtolnay commented Sep 2, 2018

Preview rustdoc: https://docs.rs/syn-next/0.15.0-rc4/syn/
Release notes work in progress:


This release introduces a new parsing API that enables procedural macros to easily report intelligent errors when provided syntactically invalid macro input.

This functionality is important because Rust 1.29.0, shipping next week, will be stabilizing support for defining functionlike!(...) procedural macros. Until now procedural macros have only been usable as custom derives, for which the compiler guarantees a syntactically valid input and thus error reporting is not the responsibility of the macro. Function-like macros a.k.a. "bang" macros are different in that the caller may pass any invalid crazy input and needs to be told by the macro where they messed it up.

See below a few examples of error messages produced by Syn with almost no effort or attention to error handling on the part of the macro author.

Parsing

With the transition to the new parsing API we drop the use of nom-style parser combinator macros in favor of plain Rust functions and control flow. Parsers should be easier to write, much easier to read, and emit reasonable error messages with very little effort.

Here is how we might implement parsing for a simplified representation of Rust structs.

/// A struct with named fields, like `struct S { a: u8, b: String }`
struct Struct {
    struct_token: Token![struct],
    ident: Ident,
    brace_token: token::Brace,
    fields: Punctuated<Field, Token![,]>,
}

impl Parse for Struct {
    fn parse(input: ParseStream) -> Result<Self> {
        let content;
        Ok(Struct {
            struct_token: input.parse()?,
            ident: input.parse()?,
            brace_token: braced!(content in input),
            fields: content.parse_terminated(Field::parse_named)?,
        })
    }
}

Another example, input that may be either a struct or an enum:

enum Item {
    Struct(Struct),
    Enum(Enum),
}

impl Parse for Item {
    fn parse(input: ParseStream) -> Result<Self> {
        let lookahead = input.lookahead1();
        if lookahead.peek(Token![struct]) {
            input.parse().map(Item::Struct)
        } else if lookahead.peek(Token![enum]) {
            input.parse().map(Item::Enum)
        } else {
            Err(lookahead.error())
        }
    }
}

This release introduces a parse_macro_input! macro that handles shuffling errors correctly back to the compiler inside of a procedural macro.

#[proc_macro]
pub fn my_macro(tokens: TokenStream) -> TokenStream {
    let input = parse_macro_input!(tokens as Item);

    /* `input` is of type Item, our data structure defined above */
}

Let's try feeding this macro some invalid input. Syn triggers a compiler error indicating the problematic token and information about what the parser would have expected in that position where possible.

error: expected `struct` or `enum`
 --> src/main.rs:7:11
  |
7 | my_macro!(uh-oh);
  |           ^^
error: expected `:`
 --> src/main.rs:7:31
  |
7 | my_macro!(struct S { a: u8, b String });
  |                               ^^^^^^
error: unexpected token
 --> src/main.rs:7:30
  |
7 | my_macro!(struct S { a: u8 } @);
  |                              ^

Compile time and performance

As always, much attention has been paid to compile time. Error reporting is nice but we don't want to spend an excessive amount of time compiling tons of error handling code.

The parsing component has been designed with compile time in mind and Syn 0.15 consistently compiles in around the same amount of time as the most recent point release of 0.14. This is despite adding support for a variety of new Rust syntax accepted through RFCs and eRFCs in recent months.

Parsing performance improves by 50% with the move to the new parsing API, as measured by time taking to parse every Rust source file in the rust-lang/rust repo and its submodules. That time is down from 10.6 seconds to 6.9 seconds on my machine.

Newly supported Rust syntax

@dtolnay
Copy link
Owner Author

dtolnay commented Sep 2, 2018

@mystor @alexcrichton @SergioBenitez @sgrif @Arnavion @Eijebong @steveklabnik if you have some time to check it out and maybe upgrade some existing crates, let me know if you have feedback on the new design. I'll plan to cut the real release around Tuesday or Wednesday.

@Eijebong
Copy link
Contributor

Eijebong commented Sep 2, 2018

I'm quite busy with the hyper update in servo right now but we're already lagging behind anywyay.
I don't see anything related to doc comments which is the main reasons we've been unable to update

@Arnavion
Copy link
Contributor

Arnavion commented Sep 2, 2018

I only use syn for custom derives via syn::parse and not much changed there, so I have no feedback.

@mystor
Copy link
Collaborator

mystor commented Sep 2, 2018

I'll look into updating synstructure soon! Thanks for doing this.

@sgrif
Copy link
Contributor

sgrif commented Sep 2, 2018

Diesel has 2 errors:

error[E0277]: the trait bound `syn::FieldValue: syn::parse::Parse` is not satisfied
  --> diesel_derives/src/field.rs:82:9
   |
82 |         parse_quote!(#tokens)
   |         ^^^^^^^^^^^^^^^^^^^^^ the trait `syn::parse::Parse` is not implemented for `syn::FieldValue`
   |
   = note: required because of the requirements on the impl of `syn::parse_quote::ParseQuote` for `syn::FieldValue`
   = note: required by `syn::parse_quote::parse`
   = note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)

The other is from the change to Pound and is easy to fix.

@dtolnay
Copy link
Owner Author

dtolnay commented Sep 2, 2018

Thanks Sean! I published 0.15.0-rc2 which puts back the impl for FieldValue. Sorry about that.

@sgrif
Copy link
Contributor

sgrif commented Sep 2, 2018

We now receive this error:

error: proc-macro derive panicked
   --> diesel/src/expression/functions/mod.rs:219:51
    |
219 |               #[derive(Debug, Clone, Copy, QueryId, DieselNumericOps)]
    |                                                     ^^^^^^^^^^^^^^^^
    |
   ::: diesel/src/expression/functions/aggregate_folding.rs:3:1
    |
3   | / sql_function! {
4   | |     /// Represents a SQL `SUM` function. This function can only take types which are
5   | |     /// Foldable.
6   | |     ///
...   |
21  | |     fn sum<ST: Foldable>(expr: ST) -> ST::Sum;
22  | | }
    | |_- in this macro invocation
    |
    = help: message: unexpected end of input, expected one of: `for`, parentheses, `fn`, `unsafe`, `extern`, identifier, `::`, `<`, square brackets, `*`, `&`, `!`, `impl`, `_`, lifetime

Not entirely sure where it's coming from, unlikely to be the parsing of the derive input (which looks like this)

            #[derive(Debug, Clone, Copy, QueryId, DieselNumericOps)]
            pub struct $fn_name<$($type_args,)* $($arg_name),*> {
                $(pub(in super) $arg_name: $arg_name,)*
                $(pub(in super) $type_args: ::std::marker::PhantomData<$type_args>,)*
            }

The derive in question is here:

https://github.com/diesel-rs/diesel/blob/98aec3392e8b31ddb25754fd6c17f2c639c5531f/diesel_derives/src/diesel_numeric_ops.rs

@dtolnay
Copy link
Owner Author

dtolnay commented Sep 2, 2018

Looks like it was from generics.where_clause.get_or_insert(parse_quote!(where)) not accepting the end of input as end of the where-clause. Fixed in 38012de.

That line can be written as generics.make_where_clause().

@kureuil
Copy link
Contributor

kureuil commented Sep 2, 2018

I just finished porting an (albeit simple) procedural macro to syn-next@0.15.0-rc3, it was very straightforward. The only problem that I encountered was my use of custom_keyword!, and after a quick glance at the documentation I couldn't find any replacement. My initial idea was to define one type per keyword, implement syn::token::Token for each of them and peek them on a lookahead. However the syn::token::Token is sealed making this impossible. I ended up parsing an Ident and matching its string representation manually, but this doesn't feel in line with the new parsing API. Did I miss something in the documentation ? If not, is there any possible work-around @dtolnay ?

@sgrif
Copy link
Contributor

sgrif commented Sep 3, 2018 via email

@alexcrichton
Copy link
Collaborator

I'd like to test out the new release but it's a long weekend in the US so I probably won't get a chance to do this until Tuesday/Wednesday at the earliest

@mystor
Copy link
Collaborator

mystor commented Sep 4, 2018

synstructure is green. Had to rewrite one custom parser (from gen_impl) but that was it.

@dtolnay
Copy link
Owner Author

dtolnay commented Sep 4, 2018

I don't have a good design for custom keywords. Currently the best workaround would be as demonstrated in this example code e.g. if you have a custom keyword called gen,

let gen_token: Ident = input.parse()?;
if gen_token != "gen" {
    return Err(Error::new(gen_token.span(), "expected `gen`"));
}

Ideally this would be something like

input.parse::<gen>()?;

// or

input.parse::<Token![""gen""]>()?;

// or

input.parse::<Keyword<gen>>()?;

with some macro to wire everything up.

custom_keyword!(gen);

There's a question of what exactly this macro emits and how to deal with visibility, since we wouldn't want something public by default because that makes it unusable in the root of a proc macro crate which cannot contain anything pub, but we want some way for the same custom keyword to be usable across modules. It could be:

custom_keyword!(pub gen);

Alternatively there's K.I.S.S. input.parse_keyword("gen") but that doesn't compose with anything else so we would also need input.peek_keyword("gen"), input.peek2_keyword("gen"), lookahead.peek_keyword("gen"), and if you want to store the resulting token in a struct then that would need to be as just Span or Ident rather than as a type that is unique to gen as with the built-in tokens.

I would love a PR or prototype if anyone wants to think about this.

@mystor
Copy link
Collaborator

mystor commented Sep 4, 2018

If we had some way to make parse work like peek (in that the "type" is passed as an argument) it could be possible to do something like input.parse(Token![#gen]) which would stringify the identifier and include it in the parameter value, allowing us to avoid the custom_keyword! call. We could probably do this with existing peek methods, but I don't see a great way to do it with parse.

Perhaps we could have a input.tparse method which works more like peek (and only takes types implementing Peek?)

@alexcrichton
Copy link
Collaborator

Ok got a chance to port over wasm-bindgen to the new syn! Some thoughts:

  • ToTokens for Error would be a nice-to-have or otherwise having Clone on Error to be able to use into_compile_error(). (or something similar maybe?)
  • The input.error("...") worked beautifully
  • The non-macro API is fantastic. I didn't use things like parenthesized! though which looked sort of funny, so far I've only touched the methods of ParseStream/ParseBuffer. The documentation for parenthesized! looked a little funny (but perhaps not moreso than nom macros!)
  • One mild concern I had was the performance of manufacturing an Error, which in theory happens on the "happy path" with recursive descent parsers. I didn't actually look at the implementation or profile anything though.

Overall this looks like a great improvement! The commit looks quite clean. I'm not sure if Parse for BindgenAttr is as ergonomic as can be though

@dtolnay
Copy link
Owner Author

dtolnay commented Sep 4, 2018

Thanks Alex!

We can add Clone for Error. Instead of ToTokens, would it help if we change into_compile_error to to_compile_error and take self by ref?

Regarding Error performance, it isn't prominent in my profiling. Note that peek is not implemented in terms of parse and so an error is not constructed when a peek is false. If your grammar is like Rust and can be parsed by looking ahead just 1 or 2 tokens, then you do that with peek. If your parsing strategy consists of trying to parse the same tokens as multiple alternatives in sequence, the parsing will dominate the construction of the error at the end. If for some reason you need to parse multiple alternatives and the error is what makes it unacceptably slow, which I haven't seen, then there is step which gives full low-level control so you're in total control of the performance. (There might be a few peek impls that still call parse, those can be rewritten but again they weren't in my profiles.)

Instead of your AnyIdent try to use Ident::parse_any.

@alexcrichton
Copy link
Collaborator

@dtolnay I think to_compile_error is all I'd really need, I just needed a way to implement ToTokens for MyCustomError where inside of that was stored a syn::parse::Error. Something like Clone would have helped with a consuming into_compile_error, but to_compile_error lifts the need for Error.

For performance that sounds great to me, it sounds like the right defaults. I like the default of "use ? everywhere and write nice code" and then if you really need it more low-level tools are available to you.

Also parse_any looks perfect, I'll use that!

@dtolnay
Copy link
Owner Author

dtolnay commented Sep 5, 2018

Update: I got busy, currently planning to land something for custom keywords as well as the changes based on Alex's feedback, then release morning PDT on Thursday.

@dtolnay
Copy link
Owner Author

dtolnay commented Sep 6, 2018

I landed a design I like for custom keywords in #481, added Clone for Error, changed Error::into_compile_error(self) to Error::to_compile_error(&self), and published rc4.

@dtolnay dtolnay closed this as completed Sep 6, 2018
bors added a commit to rust-lang/rust that referenced this issue Sep 9, 2018
proc_macro::Group::span_open and span_close

Before this addition, every delimited group like `(`...`)` `[`...`]` `{`...`}` has only a single Span that covers the full source location from opening delimiter to closing delimiter. This makes it impossible for a procedural macro to trigger an error pointing to just the opening or closing delimiter. The Rust compiler does not seem to have the same limitation:

```rust
mod m {
    type T =
}
```

```console
error: expected type, found `}`
 --> src/main.rs:3:1
  |
3 | }
  | ^
```

On that same input, a procedural macro would be forced to trigger the error on the last token inside the block, on the entire block, or on the next token after the block, none of which is really what you want for an error like above.

This commit adds `group.span_open()` and `group.span_close()` which access the Span associated with just the opening delimiter and just the closing delimiter of the group. Relevant to Syn as we implement real error messages for when parsing fails in a procedural macro: dtolnay/syn#476.

```diff
  impl Group {
      fn span(&self) -> Span;
+     fn span_open(&self) -> Span;
+     fn span_close(&self) -> Span;
  }
```

Fixes #48187
r? @alexcrichton
@dtolnay dtolnay added this to the 0.15 milestone Aug 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

7 participants