-
Notifications
You must be signed in to change notification settings - Fork 896
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support colons in path for server-side #5676
Conversation
@@ -74,13 +74,13 @@ service HttpJsonTranscodingTestService { | |||
|
|||
rpc EchoTimestampAndDuration(EchoTimestampAndDurationRequest) returns (EchoTimestampAndDurationResponse) { | |||
option (google.api.http) = { | |||
get: "/v1/echo/{timestamp}/{duration}" | |||
get: "/v1/echo/timestamp/{timestamp}/{duration}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These paths can be a source of flakiness since it clashes with the following path:
get: "/v1/echo/response_body/value" |
get: "/v1/echo/response_body/repeated" |
}; | ||
} | ||
|
||
rpc EchoTimestamp(EchoTimestampRequest) returns (EchoTimestampResponse) { | ||
option (google.api.http) = { | ||
post: "/v1/echo/{timestamp}:get", | ||
post: "/v1/echo/timestamp/{timestamp}:get", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto
post: "/v1/echo/{timestamp}:get", |
sortedEndpoints.stream().map(httpEndpoint -> httpEndpoint.spec().route().patternString()) | ||
sortedEndpoints.stream() | ||
.map(httpEndpoint -> httpEndpoint.spec().route()) | ||
.filter(route -> route.pathType() != RoutePathType.REGEX) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is necessary because GrpcDocServicePluginTest.servicesTest
fails.
When the path type is regex, a question mark is included in the path. e.g. /(?<p0>[^/])/hello
However, the regex path is incorrectly cleansed in MethodInfo
.
final RequestTarget reqTarget = RequestTarget.forServer(path); |
Since this automatic behavior is done for gRPC only, I've added a logic to filter out this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we create the method info with the examplepath as is?
I'm a bit worried about the situation that all paths that have verb suffixes wouldn't have the example paths. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, I guess I didn't think example paths were too important.
Having said this, I do think it's possible to replace regex patterns with the capturing group name at the cost of extra complexity. Pushed a commit to address this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks straightforward! Left small suggestions. 👍
core/src/main/java/com/linecorp/armeria/server/RouteBuilder.java
Outdated
Show resolved
Hide resolved
sortedEndpoints.stream().map(httpEndpoint -> httpEndpoint.spec().route().patternString()) | ||
sortedEndpoints.stream() | ||
.map(httpEndpoint -> httpEndpoint.spec().route()) | ||
.filter(route -> route.pathType() != RoutePathType.REGEX) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we create the method info with the examplepath as is?
I'm a bit worried about the situation that all paths that have verb suffixes wouldn't have the example paths. 🤔
766d8c2
to
50d8dcc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 👍 👍 👍 👍
grpc/src/main/java/com/linecorp/armeria/internal/server/grpc/GrpcDocServicePlugin.java
Outdated
Show resolved
Hide resolved
grpc/src/main/java/com/linecorp/armeria/internal/server/grpc/GrpcDocServicePlugin.java
Outdated
Show resolved
Hide resolved
@@ -543,32 +541,12 @@ private static PathMapping getPathMapping(String pathPattern) { | |||
"pathPattern: " + pathPattern + | |||
" (not an absolute path starting with '/' or a unknown pattern type)"); | |||
} | |||
if (!pathPattern.contains("{") && !pathPattern.contains(":")) { | |||
if (!pathPattern.contains("/{") && !pathPattern.contains("/:")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to this change, /path/infix{foo}
becomes a valid pattern. Previously, an exception was raised.
{
is not an allowed character so it is percent-encoded by Armeria client or server. It means a request whose path is /path/infix{foo}
fails to match the service for /path/infix{foo}
and returns 404.
We may reject the pattern as before or treat this as a known problem if the percent-encoding issue is not relevant to this PR. I also don't see this mismatch as a critical problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{ is not an allowed character so it is percent-encoded by Armeria client or server. It means a request whose path is /path/infix{foo} fails to match the service for /path/infix{foo} and returns 404.
I agree with this sentiment. However, I also thought that not only {}
but also other characters are valid path patterns but have the same issue. e.g. /$@
will have a similar issue.
Because of this, I thought 1) this was easy enough for maintainers to reason about 2) users won't be using this pattern anyways 3) If we want to limit these type of characters, it shouldn't be limited to only {}
but also other characters that should be precent-encoded.
I do think such strict validation will be nice to have for those who aren't familiar with percent-encoding rules though (I also have to refer to the rfc every time if I'm being honest). I think it might be nice to refer to other frameworks to see how strictly they are handling validation rules as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/path/infix{foo}
is not a common pattern for REST API. Let's resolve the problem when we supporting the use of a path variable with literal characters like https://github.com/spring-projects/spring-framework/blob/main/spring-core/src/test/java/org/springframework/util/AntPathMatcherTests.java#L659-L661
core/src/main/java/com/linecorp/armeria/server/ParameterizedPathMapping.java
Show resolved
Hide resolved
'(' + | ||
// If the segment doesn't start with ':' or '{', the behavior should be the same as ExactPathMapping | ||
"/[^:{][^/]*|" + | ||
"/:[^/{}]+|" + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question)
- Asumming that a path pattern is
/a/:b:c/:d
,b:c
is used as a path variable, e.g.ctx.pathParam("b:c")
. Is that an expected behavior? /brace/{b}:def
is an invalid pattern./colon/:a:verb
is an valid pattern and a path segment is captured witha:verb
at which I was a bit suprised.
If we don't support /:param:verb
style in the ParameterizedPathMapping
, I prefer to raise an exception when a path variable includes :
. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also found that the handling of {}
and :
is different which confused me at first.
Having said this, I decided to just follow the previous regex pattern since I didn't want to introduce inadvertent breaking changes.
Asumming that a path pattern is /a/:b:c/:d, b:c is used as a path variable, e.g. ctx.pathParam("b:c"). Is that an expected behavior?
/brace/{b}:def is an invalid pattern. /colon/:a:verb is an valid pattern and a path segment is captured with a:verb at which I was a bit suprised.
Double checked that this behavior is consistent with the main branch.
If we don't support /:param:verb style in the ParameterizedPathMapping, I prefer to raise an exception when a path variable includes :. What do you think?
I wanted to avoid introducing more breaking changes since this PR is already introducing certain breaking changes. I also agree that the assymmetry is confusing and it is easier to reason about if :
and {}
have the same treatment.
I guess this comes down to "does this issue annoy maintainers enough to make a breaking change (I'm not sure if users are actually using colon in names though)". I think we can discuss this further at the next stand up meeting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Double checked that this behavior is consistent with the main branch.
I was confused. /brace/{b}:def
only works on the main branch but didn't work with the previous release version, 1.28.4. As it is consistent with the previous version, it is okay.
This PR is good as is. However, ultimately, it would be good to support a path pattern that allows the path variable can be located in a path segment. Some URL patterns seem useful:
/{project}/{repoName}.git
/download/{filename}.pdf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, ultimately, it would be good to support a path pattern that allows the path variable can be located in a path segment
I agree this could be useful. I think it's probably doable if we start using {}
instead of \0
to internally representing the parameterized variables. Additionally, we probably need to modify the route parsing logic at Routers
as well.
I don't have immediate plans to handle this since RegexPathMapping
can kind of handle this case - I'd be happy to review if you would like to work on it though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To support complex path patterns to build finer grain routes, I am considering dropping :
from the path variable patterns in Armeria 2.0. We don't debate about that though.
I don't see it was a good choice that we used :
as a path variable. :
seems simple but it can be also used as a literal. If a path variable is mixed with a literal :
, it is ambiguous and difficult to know the exact name of the variable.
I don't have immediate plans to handle this since RegexPathMapping can kind of handle this case - I'd be happy to review if you would like to work on it though.
I'll try it next time. 😃
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #5676 +/- ##
============================================
+ Coverage 74.05% 74.13% +0.07%
- Complexity 21253 21340 +87
============================================
Files 1850 1855 +5
Lines 78600 78827 +227
Branches 10032 10068 +36
============================================
+ Hits 58209 58436 +227
+ Misses 15686 15671 -15
- Partials 4705 4720 +15 ☔ View full report in Codecov by Sentry. |
Motivation: This bug was found while analyzing the code for #5676. Our gRPC-transcoding path parser contains logic which uses the parameter name when constructing the mapped path. While doing so, we use an arbitrary name prefixed with a 'p' if we are unable to determine a suitable parameter name. https://github.com/line/armeria/blob/1737482b82255d2ff4728927bdf91631bf1b23e9/grpc/src/main/java/com/linecorp/armeria/server/grpc/HttpJsonTranscodingPathParser.java#L385 This can be problematic since a path may be constructed in a different way from what the user expected. e.g. Assume we use the following path pattern: - path: `/v1/conflict/{id=hello/*}/{p0}/test` The above path will currently be mapped to: - `/v1/conflict/hello/:p0/:p0/test`. However, there is no guarantee that `id` and `p0` should be equal, and hence should be built to the following path: - `/v1/conflict/hello/:p0/:p1/test`. I propose that we prepend a '@' to the generated fields. The reasoning for this is: - Protobuf doesn't allow field names other than letters, digits, underscores in identifiers. - https://protobuf.dev/reference/protobuf/proto3-spec/#identifiers - https://protobuf.dev/reference/protobuf/proto2-spec/#identifiers - '@' is a valid character in a path segment, so it stays the same after `RequestTarget#forServer` is called. This is useful when displaying the paths in `DocService` (called by `MethodInfo`). - https://datatracker.ietf.org/doc/html/rfc3986#section-3.3 As long as the above is satisfied, I think any character is fine. I actually believe any `sub-delims` character can be used, so let me know if any character should be preferred more. > sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "=" Modifications: - Modified to prepend a '@' character in front of generated fields. Result: - There is no longer a bug when a user uses field `p{\d}` for gRPC transcoding. <!-- Visit this URL to learn more about how to write a pull request description: https://armeria.dev/community/developer-guide#how-to-write-pull-request-description -->
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for handling this tricky problem, @jrhee17!
@jrhee17 👍 👍 👍 👍 👍 |
…e#5679) Motivation: This bug was found while analyzing the code for line#5676. Our gRPC-transcoding path parser contains logic which uses the parameter name when constructing the mapped path. While doing so, we use an arbitrary name prefixed with a 'p' if we are unable to determine a suitable parameter name. https://github.com/line/armeria/blob/1737482b82255d2ff4728927bdf91631bf1b23e9/grpc/src/main/java/com/linecorp/armeria/server/grpc/HttpJsonTranscodingPathParser.java#L385 This can be problematic since a path may be constructed in a different way from what the user expected. e.g. Assume we use the following path pattern: - path: `/v1/conflict/{id=hello/*}/{p0}/test` The above path will currently be mapped to: - `/v1/conflict/hello/:p0/:p0/test`. However, there is no guarantee that `id` and `p0` should be equal, and hence should be built to the following path: - `/v1/conflict/hello/:p0/:p1/test`. I propose that we prepend a '@' to the generated fields. The reasoning for this is: - Protobuf doesn't allow field names other than letters, digits, underscores in identifiers. - https://protobuf.dev/reference/protobuf/proto3-spec/#identifiers - https://protobuf.dev/reference/protobuf/proto2-spec/#identifiers - '@' is a valid character in a path segment, so it stays the same after `RequestTarget#forServer` is called. This is useful when displaying the paths in `DocService` (called by `MethodInfo`). - https://datatracker.ietf.org/doc/html/rfc3986#section-3.3 As long as the above is satisfied, I think any character is fine. I actually believe any `sub-delims` character can be used, so let me know if any character should be preferred more. > sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "=" Modifications: - Modified to prepend a '@' character in front of generated fields. Result: - There is no longer a bug when a user uses field `p{\d}` for gRPC transcoding. <!-- Visit this URL to learn more about how to write a pull request description: https://armeria.dev/community/developer-guide#how-to-write-pull-request-description -->
Motivation: Currently, gRPC verb suffix is supported by introducing a `VerbSuffixPathMapping`. There were several issues with this approach: - `VerbSuffixPathMapping` was applied inconsistently. - For instance, `RouteBuilder path(String pathPattern)` applies `VerbSuffixPathMapping`, but `RouteBuilder path(String prefix, String pathPattern)` doesn't. - In the context of `ExactPathMapping`, `VerbSuffixPathMapping` was acting as a workaround to support colons in the path. Additionally the support for colons is incomplete since only the last colon is correctly handled. - Another side-effect of this is route collisions are incorrectly reported since the verb suffix isn't taken into account. - Continued support of `VerbSuffixPathMapping` makes supporting colons natively difficult. - Misc) A path such ending with an escape character failed to start up the server: `/path\\` The original motivation for line#5613 was that the following patterns weren't supported. - /path/{param}:verb - /path/literal:verb I propose that the first case be handled via using `REGEX` `PathMapping` types, and the second case be handled by natively supporting colons in our path parameters. Note that colons are supported per the [rfc3986](https://datatracker.ietf.org/doc/html/rfc3986#section-3.3), and we already have an issue line#4577. 1. Supporting `/path/{param}:verb` I propose that we support this simply by introducing a new `PathMappingType.REGEX`. We can simply check if the last `PathSegment` is a `VerbPathSegment`. If it is a `VerbPathSegment`, then we can just use `PathMappingType.REGEX`. 2. Supporting `/path/literal:verb` Internally, we represent colons as parameterized variables both in `ParameterizedPathMapping` and `RoutingTrieBuilder`. This makes it difficult to support colons, so I propose that we modify the internal representation to a different character (`\0`). This character isn't a valid path character per [rfc3986](https://datatracker.ietf.org/doc/html/rfc3986#section-3.3), so I believe the impact is minimal. One side effect of this approach is that `ParameterizedPathMapping#paths` will return null character delimited paths. e.g. `/v0/:/path` -> `/v0/\0/path` Having said this, I believe the normal user path doesn't really use this value so it shouldn't matter. 3. Supporting colons in general When one calls `RouteBuilder#path(String)`, we determine whether to use `ParameterizedPathMapping` or `ExactPathMapping` depending on whether a colon is used. https://github.com/line/armeria/blob/8ab42847c146b481c72f000dbc1c4d77dc009220/core/src/main/java/com/linecorp/armeria/server/RouteBuilder.java#L546 - If the colon does not start the segment (e.g. `/a:b`), it is trivial to assume that the colon should be matched exactly. - If the colon does start the segment (e.g. `/:param`), it is ambiguous whether the user intended this to be a literal or parameter. For this case, I propose the following: - If a colon is used as-is, `/:param`, then the segment will be used to represent a parameter - If a colon is escaped, `/\\:param`, then the segment will be treated as a literal Optimization) `ExactPathMapping` is more performant than `ParameterizedPathMapping` because a simple equality check is done. I propose as an optimization we modify the condition to check if `/:` is contained instead of `:`. As a downside of this approach, the colon escape logic needs to also be added to `ExactPathMapping`. This can be undone if this logic seems too complicated though. Modifications: - When `VerbPathSegment` is used, use `RegexPathMapping` for gRPC transcoding. - Removed `VerbSuffixPathMapping` and related changes in `RoutingTrieBuilder`, `RouteBuilder`, and `ParameterizedPathMapping` - Modified `RoutingTrieBuilder` and `ParameterizedPathMapping` to use `\0` instead of ':' to represent parameterized path segments. This allows us to use ':' in most `PathMapping` types. - Relaxed `ParameterizedPathMapping` to act like `ExactPathMapping` when a path segment isn't capturing a path parameter. - Replace `/\\:` to `/:` in `ParameterizedPathMapping` and `ExactPathMapping` to give users an option to use the first colon as a literal. Result: - line#4577 is closed - Logic related to gRPC verb suffixes is generalized and cleaned up <!-- Visit this URL to learn more about how to write a pull request description: https://armeria.dev/community/developer-guide#how-to-write-pull-request-description -->
Motivation:
Currently, gRPC verb suffix is supported by introducing a
VerbSuffixPathMapping
.There were several issues with this approach:
VerbSuffixPathMapping
was applied inconsistently.RouteBuilder path(String pathPattern)
appliesVerbSuffixPathMapping
, butRouteBuilder path(String prefix, String pathPattern)
doesn't.ExactPathMapping
,VerbSuffixPathMapping
was acting as a workaround to support colons in the path. Additionally the support for colons is incomplete since only the last colon is correctly handled.VerbSuffixPathMapping
makes supporting colons natively difficult./path\\
The original motivation for #5613 was that the following patterns weren't supported.
I propose that the first case be handled via using
REGEX
PathMapping
types, and the second case be handled by natively supporting colons in our path parameters.Note that colons are supported per the rfc3986, and we already have an issue #4577.
/path/{param}:verb
I propose that we support this simply by introducing a new
PathMappingType.REGEX
. We can simply check if the lastPathSegment
is aVerbPathSegment
. If it is aVerbPathSegment
, then we can just usePathMappingType.REGEX
./path/literal:verb
Internally, we represent colons as parameterized variables both in
ParameterizedPathMapping
andRoutingTrieBuilder
. This makes it difficult to support colons, so I propose that we modify the internal representation to a different character (\0
). This character isn't a valid path character per rfc3986, so I believe the impact is minimal.One side effect of this approach is that
ParameterizedPathMapping#paths
will return null character delimited paths.e.g.
/v0/:/path
->/v0/\0/path
Having said this, I believe the normal user path doesn't really use this value so it shouldn't matter.
When one calls
RouteBuilder#path(String)
, we determine whether to useParameterizedPathMapping
orExactPathMapping
depending on whether a colon is used.armeria/core/src/main/java/com/linecorp/armeria/server/RouteBuilder.java
Line 546 in 8ab4284
/a:b
), it is trivial to assume that the colon should be matched exactly./:param
), it is ambiguous whether the user intended this to be a literal or parameter.For this case, I propose the following:
/:param
, then the segment will be used to represent a parameter/\\:param
, then the segment will be treated as a literalOptimization)
ExactPathMapping
is more performant thanParameterizedPathMapping
because a simple equality check is done. I propose as an optimization we modify the condition to check if/:
is contained instead of:
. As a downside of this approach, the colon escape logic needs to also be added toExactPathMapping
. This can be undone if this logic seems too complicated though.Modifications:
VerbPathSegment
is used, useRegexPathMapping
for gRPC transcoding.VerbSuffixPathMapping
and related changes inRoutingTrieBuilder
,RouteBuilder
, andParameterizedPathMapping
RoutingTrieBuilder
andParameterizedPathMapping
to use\0
instead of ':' to represent parameterized path segments. This allows us to use ':' in mostPathMapping
types.ParameterizedPathMapping
to act likeExactPathMapping
when a path segment isn't capturing a path parameter./\\:
to/:
inParameterizedPathMapping
andExactPathMapping
to give users an option to use the first colon as a literal.Result: