Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Contributing named capture groups to pathToRegexp #305

Open
martixy opened this issue May 11, 2024 · 9 comments
Open

Contributing named capture groups to pathToRegexp #305

martixy opened this issue May 11, 2024 · 9 comments

Comments

@martixy
Copy link

martixy commented May 11, 2024

Hello everyone!

I would like to add an option to the pathToRegexp to capture named groups. I searched for previous issues, as this is likely to have come up before - named groups seem a natural match to the already existing named parameters - and indeed there are many.

The question is: Would you still be open to accepting a PR to that effect as noted here: #240 (comment) or has sentiment changed?

@blakeembrey
Copy link
Member

blakeembrey commented May 11, 2024

I think the answer is no, it's not particularly something I'd want to maintain in the long run.

Is there a good pro argument as to why it's an improvement? Keep in mind named capture groups aren't supported everywhere either. I’d rather not maintain something “just because we can”.

@martixy
Copy link
Author

martixy commented May 12, 2024

Is support for old IE versions important for this project? And what kind of maintenance burden would this impose?
I looked around the code and it seems like it would be an easy change, but maybe I don't know the full picture.

Sorry if these are dumb questions, I'm new to this OSS thing.

@blakeembrey
Copy link
Member

Is support for old IE versions important for this project?

Not a large one, but unfortunately any breaking change would be a new major version. Behind a flag removes that concern though, since it'd be opt-in. This module is used in browsers so maximum browsers support is preferred.

And what kind of maintenance burden would this impose?

Mostly ensuring I keep it supported and documentation up-to-date. Since I maintain a large number of packages and have limited time, I prefer to avoid adding things that don't seem to add much value. In this case, I worry that at best it does nothing except add package weight (which is also a concern, re browsers) and at worst it ends up creating new issues or bugs (because match works better already, by decoding automatically).

So given these concerns, I like to know in an issue why it'd be a useful addition. Any extra code grows the package (bad for browsers, maintenance, and documentation). So sadly I avoid doing things nowadays just because I can.

Side note: Another place of concern would be supported characters for named capturing groups. I'm not familiar with what the character ranges are, but they might deviate from what this package supports. If not today, then in a new release. For example, /(?<0test>.*)/ throws an error but is currently a valid name. I'm guessing it follows JS variable name conventions or something, but that deviates a bit from this package.

@martixy
Copy link
Author

martixy commented May 13, 2024

Valid names:
Ah! I did not know there was such a restriction on group names. MDN doesn't mention that and key names for object props (where the name will go) do not carry this limitation.
Another consideration I thought of is how to handle duplicate names (which result in a similar syntax error). But this package already doesn't do anything special to handle duplicate parameter names (for example match will only give you the last one and discard previous matches into the void). Continuing with that philosophy in both cases seems reasonable to me. But I would add a note in the docs:

  • groups Use named parameters as named capture groups. Note: Duplicate capture group names and names starting with a digit are a syntax error in regular expressions.

Package weight:

I worry that at best it does nothing except add package weight (which is also a concern, re browsers) and at worst it ends up creating new issues or bugs (because match works better already, by decoding automatically).

Is even a little new code a major concern? Naively the whole "feature" would be like 5 lines of changed code.

Why:

I like to know in an issue why it'd be a useful addition.

TBH I have no specific use case. It's useful insofar named groups existing in regexes is useful. Because it seemed like a simple change, I thought it would be a good opportunity for a first try in contributing to an open source project.
On that note, thank you for taking the time to explain everything to me.

@martixy
Copy link
Author

martixy commented May 13, 2024

Ultimately my case would be:

  1. It's opt-in, so no chance of anything breaking.

  2. As part of the pathToRegexp function, the end result is expected to be a regex, so I would think letting people run into the native regex errors would be perfectly fine. In fact it would seem weird for this package to contort itself to hide these errors.

  3. I don't think concerns about supported characters changing in the future are warranted.

    I'm guessing it follows JS variable name conventions

    I think it just follows other regex engines (JavaScript is technically a different languange than regular expressions). JS's regex engine works the same as most others in that regard. I went exploring on regex101.com a bit, and apparently only Python and Golang are special children, where the named group syntax is slightly different. Oh, and Rust does have named groups, but not named backreferences. Named groups have been the same since PCRE1, including valid naming quirks (tho I only tested with characters allowed by pathToRegex).
    To my mind, I see this change adding effectively 0 maintenance because that would either mean JS's regex engine deviating from just about every other engine on the planet or those engines collectively deciding to change the syntax and hell freezing over seems more likely.

But correct me if any of my assumptions are wrong.

@blakeembrey
Copy link
Member

Is even a little new code a major concern? Naively the whole "feature" would be like 5 lines of changed code.

As a node-only package, not really. As a browser package, I do try to consider it. If I thought it'd be used often it's a no brainer, but so far only really one person has asked and match worked for them. For context, I do try and maintain a package size here:

"limit": "2 kB"
. I think we could trim the size of the package even more if I had time to think about how.

As a general reference, I would prefer to tunnel people away from the raw regex and into using match directly, so there's some bias there to ensuring people aren't rolling their own decoding and other functionality 😅

FWIW, I think you are completely correct with all this, I just haven't seen anyone needing the feature enough to need the extra lines of code. I understand that's unfortunate as a first contribution and I'm sorry about that. I do want to just say yes to you anyway but I think it'd be much better to spend your time on a different issue instead. If that's what you're looking for, I can think about a different feature or package that might make a good first contribution. Are you specifically interested in this package or any other particular type of feature you'd want to contribute?

@martixy
Copy link
Author

martixy commented May 13, 2024

Ah, that is unfortunate. As for what led me here, I happened to look at the code because it threw an error when I tried to name my own group and I looked around the issues for any reason why named groups were not in already in and I found that post where you said you're open to a PR and because it seemed/seems such an easy change, I decided it was a good first try.
Also, I thought it was cool that the code was so well structured, you needed like 1 line of extra new code to do it.

I can't really think of any other feature to help with - my use case for this package is really simple to begin with, so I haven't ran into anything special that needs doing.

@blakeembrey
Copy link
Member

I happened to look at the code because it threw an error when I tried to name my own group

Can you elaborate on this? What errored or what were you trying to do?

@martixy
Copy link
Author

martixy commented May 14, 2024

Ah, well, "named parameters" aren't really named in any way in the context of pathToRegexp results - all you get is numbered capture groups and it's up to you to count which one you need. TBH my initial expectation when first using this function was that "named parameters" refers to named capture groups.

So I kinda tried something funny - I tried to name what the docs call "Unnamed parameters" (the irony is delightful).
E.g. instead of
'/user/:userid'
I tried
'/user/(?<userid>.*?)'

And then you get the error Pattern cannot start with "?"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants