GEP-2895: Query Parameter Filter #2959

Open
wants to merge 4 commits into
base: main

Conversation

lianglli

What type of PR is this?
/kind gep

What this PR does / why we need it:
The HTTPRouteFilter API currently supports the RequestHeaderModifier and ResponseHeaderModifier filters.
This GEP proposes adding support for modifying query parameters in an HTTPRoute.

Which issue(s) this PR fixes:
Fixes #2895

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/gep PRs related to Gateway Enhancement Proposal(GEP) labels Apr 11, 2024
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: lianglli
Once this PR has been reviewed and has the lgtm label, please assign youngnick for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 11, 2024
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 11, 2024
@k8s-ci-robot
Contributor

Hi @lianglli. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@robscott
Member

Hey @lianglli, we're still very focused on getting v1.1 out the door so won't have time for a thorough review until after that's released. With that said, I'd recommend adding a section that shows which underlying implementations would support this. As far as I can tell, NGINX would and Envoy would not. It would be worth confirming that + looking at other common dataplanes like HAProxy to understand how widely implementable an API like this would be.

@costinm

costinm commented May 6, 2024

I was wondering: do we have WG participants who work on or have expertise in Nginx, HAProxy, Apache, Caddy, Traefik and the other data plane implementations? I know Envoy is overrepresented, so that's not a problem, but it would be worth having a wiki page listing the various data planes, along with a few names of people who can provide review and feedback.

And maybe a common checklist for each proposal, to formalize a bit more the process of making sure the features and APIs we add are implementable.

@spacewander
Contributor

> I was wondering: do we have WG participants who work on or have expertise in Nginx, HAProxy, Apache, caddy, traefik and the other data plane implementations ? I know Envoy is over represented, so not a problem - but it would be worth having some wiki page with various data planes - and few names that can provide review/feedback.
>
> And maybe a common checklist for each proposal - to formalize a bit more the process of making sure the features and APIs we add are implementable.

I would like to help in this WG. Both Envoy and Nginx are my specialties.

I was the core developer of Apache APISIX, which is a popular gateway built upon Nginx:
https://github.com/apache/apisix
https://github.com/apache/apisix/graphs/contributors.

Nowadays, I am working on an Envoy-based gateway. I have contributed some code to Envoy: https://github.com/envoyproxy/envoy/commits?author=spacewander, especially on the Golang filter: https://github.com/envoyproxy/envoy/blob/d1e95536aec14330a4235ad888246868d686cfac/CODEOWNERS#L383

Member

@LiorLieberman LiorLieberman left a comment

> Hey @lianglli, we're still very focused on getting v1.1 out the door so won't have time for a thorough review until after that's released. With that said, I'd recommend adding a section that shows which underlying implementations would support this. As far as I can tell, NGINX would and Envoy would not. It would be worth confirming that + looking at other common dataplanes like HAProxy to understand how widely implementable an API like this would be.

AFAICT it is possible with a Lua filter in Envoy (envoyproxy/envoy#2098), and there is a PR from 8 hours ago to support this at the route level.

+1 to adding a section as Rob proposed though.

@lianglli
Author

lianglli commented May 7, 2024

> I was wondering: do we have WG participants who work on or have expertise in Nginx, HAProxy, Apache, caddy, traefik and the other data plane implementations ? I know Envoy is over represented, so not a problem - but it would be worth having some wiki page with various data planes - and few names that can provide review/feedback.
>
> And maybe a common checklist for each proposal - to formalize a bit more the process of making sure the features and APIs we add are implementable.

Please check the "## Prior Art" and "## References" sections of this PR.
Moreover, HTTPQueryParamFilter is considered an extended feature.

BTW, I'm the core developer of Tengine and Tengine-Ingress.

@lianglli
Author

lianglli commented May 7, 2024

> Hey @lianglli, we're still very focused on getting v1.1 out the door so won't have time for a thorough review until after that's released. With that said, I'd recommend adding a section that shows which underlying implementations would support this. As far as I can tell, NGINX would and Envoy would not. It would be worth confirming that + looking at other common dataplanes like HAProxy to understand how widely implementable an API like this would be.

Currently, both Kong and Traefik support query parameter modification through plugins.
Tengine-Ingress supports it natively.

@mlavacca
Member

mlavacca commented May 7, 2024

/cc @mlavacca

@k8s-ci-robot k8s-ci-robot requested a review from mlavacca May 7, 2024 15:52
Member

@mlavacca mlavacca left a comment

Thanks for this PR, @lianglli!

Comment on lines 72 to 73
// Add adds the given query param(s) (name, value) to the HTTP request
// before the action.
Member

This is very similar to the other comment I left above: what happens when the query Param is already set with another value? I guess it's just a no-op

Author

@lianglli lianglli May 11, 2024

The Set action of HTTPQueryParamFilter is used to modify existing query params with the same name of an HTTP request.

// Input:
//   GET /foo?my-parameter=foo HTTP/1.1
//
// Config:
//   set:
//   - name: "my-parameter"
//     value: "bar"
//
// Output:
//   GET /foo?my-parameter=bar HTTP/1.1

The PR already includes the comments shown above.

Author

It is the same as the Set action of HTTPHeaderFilter.

type HTTPHeaderFilter struct {
    // Set overwrites the request with the given header (name, value)
    // before the action.
    //
    // Input:
    //   GET /foo HTTP/1.1
    //   my-header: foo
    //
    // Config:
    //   set:
    //   - name: "my-header"
    //     value: "bar"
    //
    // Output:
    //   GET /foo HTTP/1.1
    //   my-header: bar
    //
    // +optional
    // +listType=map
    // +listMapKey=name
    // +kubebuilder:validation:MaxItems=16
    Set []HTTPHeader `json:"set,omitempty"`

Comment on lines +59 to +60
// - name: "my-parameter"
// value: "bar"
Member

What happens if the header is not set in the request? Does it behave like an add or is it a no-op? I think it is worth mentioning the result of this corner case in the comment.

Author

This GEP proposes adding a new field, HTTPQueryParamFilter, to HTTPRouteFilter.

The Set operation of HTTPHeaderFilter does not specify what happens when the header is not set in the request.
However, Set corresponds to modify, Add corresponds to add and append, and Remove corresponds to delete.

Hence, if the query parameter named in HTTPQueryParamFilter is not set in the request, the gateway will not do anything.
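
To make the semantics described above concrete, here is a minimal Go sketch. It is not the GEP's implementation: the `QueryParam` and `QueryParamFilter` types and the `Apply` method are hypothetical names made up for illustration, and the order of operations (Set, then Add, then Remove) is an assumption. It only relies on the standard `net/url` package.

```go
package main

import (
	"fmt"
	"net/url"
)

// QueryParam is a hypothetical name/value pair used only for this sketch.
type QueryParam struct {
	Name  string
	Value string
}

// QueryParamFilter is a hypothetical stand-in for the proposed
// HTTPQueryParamFilter; the real API type may look different.
type QueryParamFilter struct {
	Set    []QueryParam
	Add    []QueryParam
	Remove []string
}

// Apply rewrites a raw query string according to the filter.
// Note that url.Values.Encode sorts parameters by name and
// percent-encodes them, so the output order may differ from the input.
func (f QueryParamFilter) Apply(rawQuery string) (string, error) {
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", err
	}
	for _, p := range f.Set {
		// Set modifies a parameter only if it is already present;
		// a missing parameter is left alone (no-op), as described in the thread.
		if _, ok := q[p.Name]; ok {
			q.Set(p.Name, p.Value)
		}
	}
	for _, p := range f.Add {
		// Add appends a value, keeping any existing values (assumption).
		q.Add(p.Name, p.Value)
	}
	for _, name := range f.Remove {
		// Remove deletes every value of the named parameter.
		q.Del(name)
	}
	return q.Encode(), nil
}

func main() {
	f := QueryParamFilter{Set: []QueryParam{{Name: "my-parameter", Value: "bar"}}}
	for _, in := range []string{"my-parameter=foo", "other=1"} {
		out, err := f.Apply(in)
		fmt.Println(in, "->", out, err)
	}
}
```

With this sketch, a request carrying `my-parameter=foo` is rewritten, while a request without `my-parameter` passes through unchanged.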

The following example shows how a HTTPRoute modifies the query parameter of an HTTP request before it is sent to the upstream target.

It allows to add query parameter for only a certain canary backend, which can help in identifying certain users by the backend service.
Based on the following http rule, query parameter "passtoken=$sign_passtoken_plain" will be added to the requests be matched against the query parameter "gray=3", then the request will be routed to the canary service "http-route-canary:80".
Member

Suggested change
Based on the following http rule, query parameter "passtoken=$sign_passtoken_plain" will be added to the requests be matched against the query parameter "gray=3", then the request will be routed to the canary service "http-route-canary:80".
Based on the following http rule, query parameter "passtoken=$sign_passtoken_plain" will be added to the requests to be matched against the query parameter "gray=3", then the request will be routed to the canary service "http-route-canary:80".


## Goals

* Provide a way to modify query parameters in a `HTTPRoute`.
Member

I think it is worth mentioning here we are talking about transforming the request

Author

Yes, it is.


```

## Prior Art
Member

Suggested change
## Prior Art
## Implementation-specific solutions

I wouldn't use the "Prior art" term, as it is not a community-based prior art, but instead an implementation-specific effort.

- new-param=some-value
```

### Traefik supports this with a Query Paramter Modification plugin
Member

Suggested change
### Traefik supports this with a Query Paramter Modification plugin
### Traefik supports this with a Query Parameter Modification plugin


### Traefik supports this with a Query Paramter Modification plugin

* This Traefik plugin allows user to modify the query parameters of an incoming request, by either adding new, deleting or modifying existing query parameters.
Member

Suggested change
* This Traefik plugin allows user to modify the query parameters of an incoming request, by either adding new, deleting or modifying existing query parameters.
* This Traefik plugin allows users to modify the query parameters of an incoming request, by either adding new, deleting or modifying existing query parameters.

@costinm

costinm commented May 8, 2024

> > I was wondering: do we have WG participants who work on or have expertise in Nginx, HAProxy, Apache, caddy, traefik and the other data plane implementations ? I know Envoy is over represented, so not a problem - but it would be worth having some wiki page with various data planes - and few names that can provide review/feedback.
> > And maybe a common checklist for each proposal - to formalize a bit more the process of making sure the features and APIs we add are implementable.
>
> Pls. check the "## Prior Art" and "## References" of this PR specifically. Moreover, the HTTPQueryParamFilter is considered an extended feature.
>
> BTW, I'm the core developer of Tengine and Tengine-Ingress.

I'm not familiar with Tengine. Is it based on Nginx?

My question is more about listing all the "upstream" proxies used by different implementations (nginx, Envoy, all the 'native' ones in Rust/Go/etc.), with info on whether each one supports the feature.

Adding optional features that are only supported by a few implementations is possible (I personally don't think it's right, but it's what this WG has decided), but I think there is a significant cost to users and to the portability of configs, so we should at least have the info.

@lianglli
Author

lianglli commented May 11, 2024

> > > I was wondering: do we have WG participants who work on or have expertise in Nginx, HAProxy, Apache, caddy, traefik and the other data plane implementations ? I know Envoy is over represented, so not a problem - but it would be worth having some wiki page with various data planes - and few names that can provide review/feedback.
> > > And maybe a common checklist for each proposal - to formalize a bit more the process of making sure the features and APIs we add are implementable.
> >
> > Pls. check the "## Prior Art" and "## References" of this PR specifically. Moreover, the HTTPQueryParamFilter is considered an extended feature.
> > BTW, I'm the core developer of Tengine and Tengine-Ingress.
>
> I'm not familiar with Tengine - is it based on Nginx ?
>
> My question is more about having all "upstream" proxies that are used by different implementations ( nginx, envoy, all 'native' ones in rust/go/etc) - and info if it supports or not that feature.
>
> Adding optional features that are only supported by a few implementations is possible (I personally don't think it's right, but it's what this WG has decided), but I think there is a significant cost on the users and portability of the configs, so at least we should have the info.

Yes, Tengine is based on the nginx core, with many advanced features (e.g., H3/QUIC, asynchronous SSL).

I understand your concern.

However, the Gateway API is the next specification for L4 to L7 network routing in cloud-native environments.
This spec should cover all the core elements defined in the relevant RFCs.

Query parameters are an important part of the request URL.
Developers can use query parameters to filter, sort, or customize data in the request body. Backend services can enable different functions based on query parameters. Query parameters also carry important information for search and tracking.
Moreover, query parameters, headers, and cookies are common techniques used in canary releases.

Just as modifying headers is useful, so is modifying query parameters.

Lastly, there are many real-world requirements for a query parameter filter.

// +listType=map
// +listMapKey=name
// +kubebuilder:validation:MaxItems=16
Set []HTTPHeader `json:"set,omitempty"`
Contributor

There are two differences between a header and a query string:

  1. Header names are case-insensitive, while query string names are not.
  2. A header value cannot be empty, but a query string value can be.

It would be better to use a new type.
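
As a rough illustration of the first point, and of empty query values, here is a small Go sketch using only the standard library. It shows how Go's `net/http` and `net/url` packages behave, which is not necessarily how any particular gateway data plane behaves; `headerLookup` and `queryLookup` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// headerLookup stores value under storedName and reads it back via lookupName.
// http.Header canonicalizes names, so the lookup is case-insensitive.
func headerLookup(storedName, value, lookupName string) string {
	h := http.Header{}
	h.Set(storedName, value)
	return h.Get(lookupName)
}

// queryLookup returns the first value of name in rawQuery and whether
// the parameter is present at all. The name comparison is case-sensitive.
func queryLookup(rawQuery, name string) (string, bool) {
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", false
	}
	vals, ok := q[name]
	if !ok || len(vals) == 0 {
		return "", false
	}
	return vals[0], true
}

func main() {
	// Header lookup succeeds even though the case differs.
	fmt.Printf("%q\n", headerLookup("my-header", "foo", "MY-HEADER"))

	// Query lookup with a different case does not find the parameter.
	v, ok := queryLookup("My-Param=a&flag=", "my-param")
	fmt.Printf("%q %v\n", v, ok)

	// A query parameter can be present with an empty value.
	v, ok = queryLookup("My-Param=a&flag=", "flag")
	fmt.Printf("%q %v\n", v, ok)
}
```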

Author

OK. I'm checking it specifically.

Author

A header name is a case-insensitive field name (rfc7230#section-3.2). A query parameter name is compared in a case-sensitive manner (rfc7230#section-2.7.3).

However, both a header value and a query parameter value can be empty.

E.g., if the authority component is missing or undefined for the target URI, then a client MUST send a Host header field with an empty field-value (rfc7230#section-5.4).

GEP-2895 is based on Gateway API v1.0.0 and v1.1.0.

The type HTTPQueryParamMatch specifies how to match against the value of a query parameter in the HTTPRoute. The Name field of HTTPQueryParamMatch is of type HTTPHeaderName.

However, it would be better to add a new type, HTTPQuery, for HTTPQueryParamFilter.

Author

@spacewander

In Gateway API v1.0.0 and v1.1.0, the type HeaderName is the name of a header or query parameter.

So it would be better to open a separate PR for that change.
However, GEP-2895 will add specific comments about the name and value of a query parameter.

// HeaderName is the name of a header or query parameter.
//
// +kubebuilder:validation:MinLength=1
// +kubebuilder:validation:MaxLength=256
// +kubebuilder:validation:Pattern=`^[A-Za-z0-9!#$%&'*+\-.^_\x60|~]+$`
// +k8s:deepcopy-gen=false
type HeaderName string
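
For reference, the validation markers above can be approximated in plain Go. This is only a sketch of what the MinLength, MaxLength, and Pattern markers express, assuming Go's `regexp` package accepts the same pattern; `validName` is a hypothetical helper and is not part of the Gateway API.

```go
package main

import (
	"fmt"
	"regexp"
)

// namePattern mirrors the kubebuilder Pattern marker on HeaderName.
// \x60 is the backtick character, written as an escape so the
// pattern can live inside a Go raw string literal.
var namePattern = regexp.MustCompile(`^[A-Za-z0-9!#$%&'*+\-.^_\x60|~]+$`)

// validName approximates the MinLength=1, MaxLength=256, and Pattern
// markers. Lengths are counted in bytes, which matches character
// counts here because the pattern only admits ASCII.
func validName(name string) bool {
	return len(name) >= 1 && len(name) <= 256 && namePattern.MatchString(name)
}

func main() {
	for _, name := range []string{"my-parameter", "my parameter", ""} {
		fmt.Printf("%q valid: %v\n", name, validName(name))
	}
}
```

Because the pattern has no case folding, `My-Param` and `my-param` are both accepted as distinct names; whether they are treated as the same key is up to the consumer of the type.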

Contributor

I see

}
```

## Examples
Contributor

Author

ok, I will check them specifically.

@robscott
Member

Hey @lianglli, we're working on scoping new features for v1.2, do you mind adding a comment to track support for this proposal in #3103?

Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/gep PRs related to Gateway Enhancement Proposal(GEP) needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

8 participants