Gateway: Combining HTTPS listener with TLS-termination and TLS listener with TLS-passthrough #6985

Open
vehagn opened this issue May 4, 2024 · 4 comments · May be fixed by #6986
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@vehagn

vehagn commented May 4, 2024

Describe the bug:

I'm trying to create a Gateway that uses both an HTTPS listener with a certificate provided by Cert-manager, and a TLS listener with TLS-passthrough.

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test
  namespace: gateway
  annotations:
    cert-manager.io/issuer: cloudflare-issuer
spec:
  gatewayClassName: cilium
  listeners:
    - protocol: HTTPS
      port: 443
      name: https-gateway
      hostname: "*.example.com"
      tls:
        certificateRefs:
          - kind: Secret
            name: test-cert
      allowedRoutes:
        namespaces:
          from: All
    - protocol: TLS
      port: 443
      name: proxmox-tls-passthrough
      hostname: "proxmox.example.com"
      tls:
        mode: Passthrough
      allowedRoutes:
        namespaces:
          from: All

When I add the TLS listener, the Gateway becomes unresponsive for all HTTPRoutes and TLSRoutes connected to it.
The event log for the Gateway states:

Skipped a listener block: [spec.listeners[1].tls.certificateRef: Required value: listener has no certificateRefs, spec.listeners[1].tls.mode: Unsupported value: "Passthrough": supported values: "Terminate"]

Expected behaviour:

I expect the Gateway to work with both listeners.
Cert-manager should allow/ignore the TLS listener running in Passthrough mode.

Steps to reproduce the bug:

Create the above Gateway.

Anything else we need to know?:

Cilium 1.15.1 provides the GatewayClass.
I initially believed this to be a Cilium-issue, but with further investigation it looks to be an issue with Cert-manager.

A workaround is to create two Gateways, one per listener. Alternatively, to expose only one LoadBalancer IP, route the Service of the HTTPS-listener Gateway through to the TLS-listener Gateway.
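
As a minimal sketch of the two-Gateway workaround, the manifest from the report could be split as follows. The Gateway names `test-https` and `test-tls` are hypothetical; everything else is copied from the Gateway above. Only the HTTPS Gateway carries the Cert-manager annotation, so the gateway-shim never sees the Passthrough listener.

```yaml
# Gateway 1: HTTPS with TLS termination, certificate managed by Cert-manager.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test-https            # hypothetical name
  namespace: gateway
  annotations:
    cert-manager.io/issuer: cloudflare-issuer
spec:
  gatewayClassName: cilium
  listeners:
    - protocol: HTTPS
      port: 443
      name: https-gateway
      hostname: "*.example.com"
      tls:
        certificateRefs:
          - kind: Secret
            name: test-cert
      allowedRoutes:
        namespaces:
          from: All
---
# Gateway 2: TLS passthrough, deliberately without the Cert-manager annotation.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test-tls              # hypothetical name
  namespace: gateway
spec:
  gatewayClassName: cilium
  listeners:
    - protocol: TLS
      port: 443
      name: proxmox-tls-passthrough
      hostname: "proxmox.example.com"
      tls:
        mode: Passthrough
      allowedRoutes:
        namespaces:
          from: All
```

Routes would then reference whichever Gateway matches their kind in `parentRefs`. Note that each Gateway normally gets its own Service, which is why this variant exposes two LoadBalancer IPs.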

Environment details:

  • Kubernetes version: 1.29.3
  • Cloud-provider/provisioner: Bare metal
  • cert-manager version: 1.14.4
  • Install method: Kustomize + Helm

# kustomization.yaml
helmCharts:
  - name: cert-manager
    repo: https://charts.jetstack.io
    version: 1.14.4
    includeCRDs: true
    releaseName: cert-manager
    namespace: cert-manager
    valuesFile: values.yaml

# values.yaml
installCRDs: true

config:
  apiVersion: controller.config.cert-manager.io/v1alpha1
  kind: ControllerConfiguration
  featureGates:
    ExperimentalGatewayAPISupport: true

/kind bug

@cert-manager-prow cert-manager-prow bot added the kind/bug Categorizes issue or PR as related to a bug. label May 4, 2024
@vehagn vehagn linked a pull request May 4, 2024 that will close this issue
@hawksight
Member

Hey @vehagn thanks for raising. Where does this log come from?

Skipped a listener block: [spec.listeners[1].tls.certificateRef: Required value: listener has no certificateRefs, spec.listeners[1].tls.mode: Unsupported value: "Passthrough": supported values: "Terminate"]

Does the TLS listener prevent the Certificate being created for the HTTPS listener in the single file example?

@vehagn
Author

vehagn commented May 7, 2024

Thanks for picking up the issue @hawksight,

The log comes from the events when you run kubectl describe on the Gateway resource.

I’m unsure what you mean by “single file example.” I think the certificate is successfully created for the HTTPS listener, though it could be left over from before I added the TLS listener (I can double-check later).

My main issue is that HTTPRoutes connected to the Gateway become unresponsive, meaning that I can’t access the Services behind them.

I tried to solve the issue in the linked PR, but haven’t built an image of the branch and tested it fully.

@hawksight
Member

By "single file example" I meant this YAML:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test
  namespace: gateway
  annotations:
    cert-manager.io/issuer: cloudflare-issuer
spec:
  gatewayClassName: cilium
  listeners:
    - protocol: HTTPS
      port: 443
      name: https-gateway
      hostname: "*.example.com"
      tls:
        certificateRefs:
          - kind: Secret
            name: test-cert
      allowedRoutes:
        namespaces:
          from: All
    - protocol: TLS
      port: 443
      name: proxmox-tls-passthrough
      hostname: "proxmox.example.com"
      tls:
        mode: Passthrough
      allowedRoutes:
        namespaces:
          from: All

I was trying to understand if cert-manager was actually preventing the Gateway from working, or if that is a Gateway concern. I'll look at the PR more closely to understand the change.

Can you share how you have your Gateway installed? Is it via the standard YAML & CRDs or via a particular project that implements the GatewayAPI?

@vehagn
Author

vehagn commented May 9, 2024

Thanks for the clarification @hawksight.

I've done some more testing, which I'll try to explain in detail below.

The testing leads me to believe the connectivity issues might be linked to Cilium Issue #32371. Even so, I still think that fixing the BadConfig warning for TLS listeners in Passthrough mode, as I've attempted in PR #6986, is an improvement.


I'm fetching the Gateway CRDs from https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.0.0/experimental-install.yaml. I'm using the experimental install since I want to use the Gateway spec.infrastructure.annotations field to explicitly set the Gateway Service IP. I've omitted this field in the test Gateway.
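
For context, a minimal sketch of what the `spec.infrastructure.annotations` field might look like on this Gateway. The annotation key `io.cilium/lb-ipam-ips` and the address `192.0.2.10` are assumptions for illustration, not taken from the configuration in this issue; check which annotation your Gateway implementation and load balancer actually honour.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test
  namespace: gateway
spec:
  gatewayClassName: cilium
  infrastructure:
    annotations:
      # Assumed key and placeholder IP; propagated to the generated Service.
      io.cilium/lb-ipam-ips: "192.0.2.10"
  listeners:
    - protocol: HTTPS
      port: 443
      name: https-gateway
      hostname: "*.stonegarden.dev"
      tls:
        certificateRefs:
          - kind: Secret
            name: test-cert
      allowedRoutes:
        namespaces:
          from: All
```

The `infrastructure` field is only present in the experimental channel of the v1.0.0 CRDs, which is the reason given above for using the experimental install.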

The full configuration can be found at https://gitlab.com/vehagn/mini-homelab

For a new test, I first create a Gateway without the Cert-manager annotation

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test
  namespace: gateway
#  annotations:
#    cert-manager.io/issuer: cloudflare-issuer
spec:
  gatewayClassName: cilium
  listeners:
    - protocol: HTTPS
      port: 443
      name: https-gateway
      hostname: "*.stonegarden.dev"
      tls:
        certificateRefs:
          - kind: Secret
            name: test-cert
      allowedRoutes:
        namespaces:
          from: All
    - protocol: TLS
      port: 443
      name: proxmox-tls-passthrough
      hostname: "proxmox-test.euclid.stonegarden.dev"
      tls:
        mode: Passthrough
      allowedRoutes:
        namespaces:
          from: All

and a TLSRoute

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TLSRoute
metadata:
  name: test
  namespace: proxmox
spec:
  parentRefs:
    - name: test
      namespace: gateway
  hostnames:
    - "proxmox-test.euclid.stonegarden.dev"
  rules:
    - backendRefs:
        - name: proxmox-euclid
          port: 443

I'm now able to reach proxmox-test.euclid.stonegarden.dev through the Gateway.

Next, I add the Cert-manager annotation (uncommenting it in the Gateway above)

  annotations:
    cert-manager.io/issuer: cloudflare-issuer

Running kubectl describe on the Gateway, I now get

❯ kubectl -n gateway describe gateway test
Name:         test
Namespace:    gateway
Labels:       argocd.argoproj.io/instance=gateway
Annotations:  argocd.argoproj.io/tracking-id: gateway:gateway.networking.k8s.io/Gateway:gateway/test
              cert-manager.io/issuer: cloudflare-issuer
API Version:  gateway.networking.k8s.io/v1
Kind:         Gateway
Metadata:
  Creation Timestamp:  2024-05-09T09:08:12Z
  Generation:          2
  Resource Version:    8865763
  UID:                 8437c2e5-b7e1-4d71-b5d5-15995fe4faa5
Spec:
  Gateway Class Name:  cilium
  Listeners:
    Allowed Routes:
      Namespaces:
        From:  All
    Hostname:  *.stonegarden.dev
    Name:      https-gateway
    Port:      443
    Protocol:  HTTPS
    Tls:
      Certificate Refs:
        Group:  
        Kind:   Secret
        Name:   test-cert
      Mode:     Terminate
    Allowed Routes:
      Namespaces:
        From:  All
    Hostname:  proxmox.euclid.stonegarden.dev
    Name:      proxmox-tls-passthrough
    Port:      443
    Protocol:  TLS
    Tls:
      Mode:  Passthrough
Status:
  Addresses:
    Type:   IPAddress
    Value:  192.168.1.221
  Conditions:
    Last Transition Time:  2024-05-09T09:13:15Z
    Message:               Gateway successfully scheduled
    Observed Generation:   2
    Reason:                Accepted
    Status:                True
    Type:                  Accepted
    Last Transition Time:  2024-05-09T09:13:15Z
    Message:               Gateway successfully reconciled
    Observed Generation:   2
    Reason:                Programmed
    Status:                True
    Type:                  Programmed
  Listeners:
    Attached Routes:  1
    Conditions:
      Last Transition Time:  2024-05-09T22:00:24Z
      Message:               Listener Programmed
      Observed Generation:   2
      Reason:                Programmed
      Status:                True
      Type:                  Programmed
      Last Transition Time:  2024-05-09T22:00:24Z
      Message:               Listener Accepted
      Observed Generation:   2
      Reason:                Accepted
      Status:                True
      Type:                  Accepted
      Last Transition Time:  2024-05-09T22:00:24Z
      Message:               Resolved Refs
      Reason:                ResolvedRefs
      Status:                True
      Type:                  ResolvedRefs
    Name:                    https-gateway
    Supported Kinds:
      Group:          gateway.networking.k8s.io
      Kind:           HTTPRoute
    Attached Routes:  1
    Conditions:
      Last Transition Time:  2024-05-09T22:00:24Z
      Message:               Listener Programmed
      Observed Generation:   2
      Reason:                Programmed
      Status:                True
      Type:                  Programmed
      Last Transition Time:  2024-05-09T22:00:24Z
      Message:               Listener Accepted
      Observed Generation:   2
      Reason:                Accepted
      Status:                True
      Type:                  Accepted
      Last Transition Time:  2024-05-09T22:00:24Z
      Message:               Resolved Refs
      Reason:                ResolvedRefs
      Status:                True
      Type:                  ResolvedRefs
    Name:                    proxmox-tls-passthrough
    Supported Kinds:
      Group:  gateway.networking.k8s.io
      Kind:   TLSRoute
Events:
  Type     Reason             Age                From                       Message
  ----     ------             ----               ----                       -------
  Normal   CreateCertificate  66s                cert-manager-gateway-shim  Successfully created Certificate "test-cert"
  Warning  BadConfig          54s (x9 over 66s)  cert-manager-gateway-shim  Skipped a listener block: [spec.listeners[1].tls.certificateRef: Required value: listener has no certificateRefs, spec.listeners[1].tls.mode: Unsupported value: "Passthrough": supported values: "Terminate"]

I can still access proxmox-test.euclid.stonegarden.dev, and I see that the certificate is created successfully.
Interestingly, both listeners report one attached route.
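
One way to narrow this down might be to pin the route to a single listener. The TLSRoute above names only the Gateway in `parentRefs`, so it is a candidate for every listener on it. The sketch below adds the Gateway API `sectionName` field, which selects a listener by name. Whether this changes the attached-route counts or the connectivity behaviour described here is untested and is an assumption.

```yaml
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TLSRoute
metadata:
  name: test
  namespace: proxmox
spec:
  parentRefs:
    - name: test
      namespace: gateway
      # Assumption: attach only to the Passthrough listener, by listener name.
      sectionName: proxmox-tls-passthrough
  hostnames:
    - "proxmox-test.euclid.stonegarden.dev"
  rules:
    - backendRefs:
        - name: proxmox-euclid
          port: 443
```

The same field is available on HTTPRoute `parentRefs`, where it could point at `https-gateway` instead.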

Then I comment out the Cert-manager annotation again

#  annotations:
#    cert-manager.io/issuer: cloudflare-issuer

and create an HTTPRoute for the Gateway.

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: http-route
  namespace: whoami
spec:
  parentRefs:
    - name: test
      namespace: gateway
  hostnames:
    - "https-test.stonegarden.dev"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: whoami
          port: 80

I now get ERR_CONNECTION_RESET when trying to access https-test.stonegarden.dev. The TLSRoute endpoint proxmox-test.euclid.stonegarden.dev still works.

The HTTPRoute status indicates that it should work.

status:
  parents:
    - conditions:
        - lastTransitionTime: '2024-05-09T22:28:45Z'
          message: Accepted HTTPRoute
          observedGeneration: 2
          reason: Accepted
          status: 'True'
          type: Accepted
        - lastTransitionTime: '2024-05-09T22:28:45Z'
          message: Service reference is valid
          observedGeneration: 2
          reason: ResolvedRefs
          status: 'True'
          type: ResolvedRefs
      controllerName: io.cilium/gateway-controller
      parentRef:
        group: gateway.networking.k8s.io
        kind: Gateway
        name: test
        namespace: gateway

and the Gateway reports two routes attached to the HTTPS-listener

status:
  addresses:
    - type: IPAddress
      value: 192.168.1.221
  conditions:
    - lastTransitionTime: '2024-05-09T22:28:36Z'
      message: Gateway successfully scheduled
      observedGeneration: 11
      reason: Accepted
      status: 'True'
      type: Accepted
    - lastTransitionTime: '2024-05-09T22:28:36Z'
      message: Gateway successfully reconciled
      observedGeneration: 11
      reason: Programmed
      status: 'True'
      type: Programmed
  listeners:
    - attachedRoutes: 2
      conditions:
        - lastTransitionTime: '2024-05-09T22:28:36Z'
          message: Listener Programmed
          observedGeneration: 11
          reason: Programmed
          status: 'True'
          type: Programmed
        - lastTransitionTime: '2024-05-09T22:28:36Z'
          message: Listener Accepted
          observedGeneration: 11
          reason: Accepted
          status: 'True'
          type: Accepted
        - lastTransitionTime: '2024-05-09T22:28:36Z'
          message: Resolved Refs
          reason: ResolvedRefs
          status: 'True'
          type: ResolvedRefs
      name: https-gateway
      supportedKinds:
        - group: gateway.networking.k8s.io
          kind: HTTPRoute
    - attachedRoutes: 1
      conditions:
        - lastTransitionTime: '2024-05-09T22:28:36Z'
          message: Listener Programmed
          observedGeneration: 11
          reason: Programmed
          status: 'True'
          type: Programmed
        - lastTransitionTime: '2024-05-09T22:28:36Z'
          message: Listener Accepted
          observedGeneration: 11
          reason: Accepted
          status: 'True'
          type: Accepted
        - lastTransitionTime: '2024-05-09T22:28:36Z'
          message: Resolved Refs
          reason: ResolvedRefs
          status: 'True'
          type: ResolvedRefs
      name: proxmox-tls-passthrough
      supportedKinds:
        - group: gateway.networking.k8s.io
          kind: TLSRoute

Then I uncomment the Cert-manager annotation again

  annotations:
    cert-manager.io/issuer: cloudflare-issuer

And I can still connect to the TLSRoute endpoint, but not the HTTPRoute endpoint.

After deleting the Gateway and waiting for Argo CD to recreate it, the TLSRoute endpoint also stops working.

After deleting the Gateway and waiting for Argo CD to recreate it again, the TLSRoute endpoint works again.

Deleting and recreating the gateway appears to continue this flip-flop pattern.

Cert-manager diligently reattaches the certificate it created earlier each time.


Edit:

Removing the TLS-listener from the Gateway: the TLSRoute endpoint still responds, the HTTPRoute endpoint doesn't.
Next, deleting the TLSRoute: the TLSRoute endpoint stops responding (the endpoint presents the wildcard certificate), and the HTTPRoute endpoint finally starts working!

The TLSRoute appears to work without an HTTPS-listener (which is only supposed to accept HTTPRoutes) and "blocks" the HTTPRoute.

The commit-history of the above testing can be found here.


2 participants