
InternalError (failed calling webhook "ipaddresspoolvalidationwebhook.metallb.io") #1540

Closed
MykolaBordakov opened this issue Jul 27, 2022 · 43 comments


@MykolaBordakov

MykolaBordakov commented Jul 27, 2022

Good day. I'm opening a new issue.
As I explained in #1481, when I try to configure an IPAddressPool and an L2Advertisement, the system gives me an error. Full text here:

Error from server (InternalError): error when creating "kuber-fiels/confi_metalLB.yaml": Internal error occurred: failed calling webhook "l2advertisementvalidationwebhook.metallb.io": failed to call webhook: Post "https://webhook-service.metallb-system.svc:443/validate-metallb-io-v1beta1-l2advertisement?timeout=10s": dial tcp 10.100.2.163:443: connect: connection refused

My controller logs here:
controller.txt

API server logs:
API_logs.txt
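When the webhook call fails with connection refused, a first hedged check (not from the thread; standard kubectl diagnostics) is whether the webhook Service actually has a ready endpoint behind it. The service and namespace names below are taken from the error message itself; the `deploy/controller` name assumes the stock MetalLB manifest:

```shell
# Sketch of basic diagnostics, assuming a stock MetalLB install in metallb-system.
# 1) Does the Service exist, and does it have endpoints?
kubectl -n metallb-system get svc webhook-service
kubectl -n metallb-system get endpoints webhook-service
# 2) Is the controller (which serves the webhook) actually Running and Ready?
kubectl -n metallb-system get pods -o wide
# 3) Recent controller logs often show cert-rotation / webhook serving messages.
kubectl -n metallb-system logs deploy/controller --tail=50
```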

@MykolaBordakov
Author

@fedepaol Good day.
Sorry to bother you. Any ideas on how to solve my problem? Did you check my controller's log?

@web-engineer

web-engineer commented Jul 29, 2022

I have the same issue - using v0.13.

validatingwebhookconfiguration.admissionregistration.k8s.io/metallb-webhook-configuration created
Error from server (InternalError): error when creating "metallb-manifest.yaml": Internal error occurred: failed calling webhook "l2advertisementvalidationwebhook.metallb.io": failed to call webhook: Post "https://webhook-service.metallb-system.svc:443/validate-metallb-io-v1beta1-l2advertisement?timeout=10s": context deadline exceeded
Error from server (InternalError): error when creating "metallb-manifest.yaml": Internal error occurred: failed calling webhook "ipaddresspoolvalidationwebhook.metallb.io": failed to call webhook: Post "https://webhook-service.metallb-system.svc:443/validate-metallb-io-v1beta1-ipaddresspool?timeout=10s": context deadline exceeded
Error from server (InternalError): error when creating "metallb-manifest.yaml": Internal error occurred: failed calling webhook "l2advertisementvalidationwebhook.metallb.io": failed to call webhook: Post "https://webhook-service.metallb-system.svc:443/validate-metallb-io-v1beta1-l2advertisement?timeout=10s": context deadline exceeded

Note that the other issue (#1481) is using BGP, I think; I'm attempting to configure layer 2 in this instance.

I've taken the default yaml and appended my changes -

---
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default
  namespace: metallb-system
spec:
  addresses:
  - 192.168.1.100-192.168.1.110
status: {}
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: l2advertisement1
  namespace: metallb-system
spec:
  ipAddressPools:
  - default
status: {}
---

I've exhausted everything I can find online atm - going to try the previous version next.

@web-engineer

In my case it's definitely these CRDs failing to call the API endpoints - I'm not sure why, and I can't inspect the logs of the controller since it's stuck in a reboot loop.

However, after removing these, the controller hits a warning state and the speakers are fine (don't forget to set your memberlist secret).

I was under the assumption that the reason for the error in the controller was the missing CRDs, but I think the issue is actually within the controller itself - since you can't create these resources if the controller isn't running!

Hope this helps point you where to look - I'm still looking.

In my instance I'm using default k0s with kube-router, trying to add MetalLB in a cloud-style setup on our own in-house machines running a set of KVMs for testing.

@web-engineer

web-engineer commented Jul 29, 2022

I'm feeling a bit lost now... I did notice that in the Secrets there is a certificate definition, but it appears to be missing the certificate itself... could this be the issue?

apiVersion: v1
kind: Secret
metadata:
  name: webhook-server-cert
  namespace: metallb-system
  uid: 4dc4976d-1bbd-4b80-84c2-4edfba23145e
  resourceVersion: '187773'
  creationTimestamp: '2022-07-29T11:37:36Z'
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: >
      {"apiVersion":"v1","kind":"Secret","metadata":{"annotations":{},"name":"webhook-server-cert","namespace":"metallb-system"}}
  managedFields:
    - manager: kubectl-client-side-apply
      operation: Update
      apiVersion: v1
      time: '2022-07-29T11:37:36Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:kubectl.kubernetes.io/last-applied-configuration: {}
        f:type: {}
  selfLink: /api/v1/namespaces/metallb-system/secrets/webhook-server-cert
type: Opaque
data: {}

This is generated when the manifest is applied initially. I'm moving on to something else for a bit - I may try regressing to the previous release soon - any pointers appreciated.

EDIT...

There is a certificate in the ConfigMaps -

apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-root-ca.crt
  namespace: metallb-system
  uid: b5977b69-416b-4f9a-aa81-a2f73ba39d63
  resourceVersion: '187729'
  creationTimestamp: '2022-07-29T11:37:34Z'
  annotations:
    kubernetes.io/description: >-
      Contains a CA bundle that can be used to verify the kube-apiserver when
      using internal endpoints such as the internal service IP or
      kubernetes.default.svc. No other usage is guaranteed across distributions
      of Kubernetes clusters.
  managedFields:
    - manager: kube-controller-manager
      operation: Update
      apiVersion: v1
      time: '2022-07-29T11:37:34Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:data:
          .: {}
          f:ca.crt: {}
        f:metadata:
          f:annotations:
            .: {}
            f:kubernetes.io/description: {}
  selfLink: /api/v1/namespaces/metallb-system/configmaps/kube-root-ca.crt
data:
  ca.crt: |
    -----BEGIN CERTIFICATE-----
    XXXXXXXXXXXXXXXXXXXXX
    -----END CERTIFICATE-----
binaryData: {}

However, these names don't appear to match... if there has been a move from ConfigMaps to CRDs, is there a disconnect here?

@MykolaBordakov
Author

Good day.
@web-engineer thank you for the idea.
I checked the ConfigMap and the Secret. In my case the ca.crt values are different. Is that normal?

@web-engineer

Making them match didn't work for me. I've torn down the install and I'm starting fresh - this time with 0.12 - but now I'm stuck with the controller restarting / stuck in a pending state... I suspect this is a symptom of something I'm missing, so I'm debugging it at the moment, hoping that whatever config is needed/missing will fix things or highlight what to fix in the current version.

@fedepaol
Member

@web-engineer can you confirm that the data field of webhook-server-cert is empty?
I think that's the key.
The way it works is: there is a component in the controller that keeps that secret up to date with a certificate signed by the cluster's CA, which is then used to patch the webhooks.
Another check is to see whether the clientConfig part of the validatingwebhookconfiguration contains the certificate, because it starts out empty and then gets filled in.
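Both checks can be scripted; a sketch, using the object names that appear elsewhere in this thread:

```shell
# Is the secret still empty? An empty/absent "data" field means no cert yet.
kubectl -n metallb-system get secret webhook-server-cert -o jsonpath='{.data}'
# Has the caBundle been injected into the webhook clientConfig?
kubectl get validatingwebhookconfiguration metallb-webhook-configuration \
  -o jsonpath='{.webhooks[0].clientConfig.caBundle}' | head -c 40; echo
```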

@fedepaol
Member

Also, despite the symptoms being the same, it may be a different issue. Can you set the logs of the controller to debug (-loglevel debug) and send them over?

@web-engineer

I'm just going to move back to the latest release, as regressing isn't helping... I'll try to get back to where I was and get those logs for you...

@fedepaol
Member

fedepaol commented Jul 29, 2022

In @MykolaBordakov 's logs I see

{"level":"info","ts":1658940727.201935,"logger":"cert-rotation","msg":"Ensuring CA cert","name":"metallb-webhook-configuration","gvk":"admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration","name":"metallb-webhook-configuration","gvk":"admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration"}
{"level":"info","ts":1658940727.2072318,"logger":"cert-rotation","msg":"Ensuring CA cert","name":"addresspools.metallb.io","gvk":"apiextensions.k8s.io/v1, Kind=CustomResourceDefinition","name":"addresspools.metallb.io","gvk":"apiextensions.k8s.io/v1, Kind=CustomResourceDefinition"}
{"level":"info","ts":1658940727.2150514,"logger":"cert-rotation","msg":"Ensuring CA cert","name":"bgppeers.metallb.io","gvk":"apiextensions.k8s.io/v1, Kind=CustomResourceDefinition","name":"bgppeers.metallb.io","gvk":"apiextensions.k8s.io/v1, Kind=CustomResourceDefinition"}
{"level":"info","ts":1658940727.227721,"logger":"cert-rotation","msg":"Ensuring CA cert","name":"metallb-webhook-configuration","gvk":"admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration","name":"metallb-webhook-configuration","gvk":"admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration"}
{"level":"info","ts":1658940727.2334685,"logger":"cert-rotation","msg":"Ensuring CA cert","name":"addresspools.metallb.io","gvk":"apiextensions.k8s.io/v1, Kind=CustomResourceDefinition","name":"addresspools.metallb.io","gvk":"apiextensions.k8s.io/v1, Kind=CustomResourceDefinition"}
{"level":"info","ts":1658940727.2437708,"logger":"cert-rotation","msg":"Ensuring CA cert","name":"bgppeers.metallb.io","gvk":"apiextensions.k8s.io/v1, Kind=CustomResourceDefinition","name":"bgppeers.metallb.io","gvk":"apiextensions.k8s.io/v1, Kind=CustomResourceDefinition"}
{"level":"info","ts":1658940727.2545674,"logger":"cert-rotation","msg":"Ensuring CA cert","name":"metallb-webhook-configuration","gvk":"admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration","name":"metallb-webhook-configuration","gvk":"admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration"}
{"level":"info","ts":1658940727.2601233,"logger":"cert-rotation","msg":"Ensuring CA cert","name":"addresspools.metallb.io","gvk":"apiextensions.k8s.io/v1, Kind=CustomResourceDefinition","name":"addresspools.metallb.io","gvk":"apiextensions.k8s.io/v1, Kind=CustomResourceDefinition"}
{"level":"info","ts":1658940727.267845,"logger":"cert-rotation","msg":"Ensuring CA cert","name":"bgppeers.metallb.io","gvk":"apiextensions.k8s.io/v1, Kind=CustomResourceDefinition","name":"bgppeers.metallb.io","gvk":"apiextensions.k8s.io/v1, Kind=CustomResourceDefinition"}

which means the controller did its job and patched the webhooks.

@web-engineer

@web-engineer can you confirm that the data field of webhook-server-cert is empty? I think that's the key. The way it works is, we have this component in the controller that keeps that secret up to date with the certificate signed via the cluster's CA, and used to patch the webhooks. Another check to be done is to see if the clientConfig part of the validatingwebhookconfiguration contains the certificate, because it starts empty and then it gets filled.

The webhook-server-cert secret has data: {} as above.

I've just reverted the stack and I'm just trying to elevate the logging to debug...

@MykolaBordakov
Author

@fedepaol You mean that the controller works well?
So, if the controller is OK, maybe the problem is with the ca-cert?

@fedepaol
Member

can you also check the validatingwebhookconfiguration content?
k get validatingwebhookconfiguration -o yaml

@web-engineer

➜  metallb kubectl get validatingwebhookconfiguration -o yaml
apiVersion: v1
items:
- apiVersion: admissionregistration.k8s.io/v1
  kind: ValidatingWebhookConfiguration
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"admissionregistration.k8s.io/v1","kind":"ValidatingWebhookConfiguration","metadata":{"annotations":{},"creationTimestamp":null,"name":"metallb-webhook-configuration"},"webhooks":[{"admissionReviewVersions":["v1"],"clientConfig":{"service":{"name":"webhook-service","namespace":"metallb-system","path":"/validate-metallb-io-v1beta2-bgppeer"}},"failurePolicy":"Fail","name":"bgppeersvalidationwebhook.metallb.io","rules":[{"apiGroups":["metallb.io"],"apiVersions":["v1beta2"],"operations":["CREATE","UPDATE"],"resources":["bgppeers"]}],"sideEffects":"None"},{"admissionReviewVersions":["v1"],"clientConfig":{"service":{"name":"webhook-service","namespace":"metallb-system","path":"/validate-metallb-io-v1beta1-addresspool"}},"failurePolicy":"Fail","name":"addresspoolvalidationwebhook.metallb.io","rules":[{"apiGroups":["metallb.io"],"apiVersions":["v1beta1"],"operations":["CREATE","UPDATE"],"resources":["addresspools"]}],"sideEffects":"None"},{"admissionReviewVersions":["v1"],"clientConfig":{"service":{"name":"webhook-service","namespace":"metallb-system","path":"/validate-metallb-io-v1beta1-bfdprofile"}},"failurePolicy":"Fail","name":"bfdprofilevalidationwebhook.metallb.io","rules":[{"apiGroups":["metallb.io"],"apiVersions":["v1beta1"],"operations":["DELETE"],"resources":["bfdprofiles"]}],"sideEffects":"None"},{"admissionReviewVersions":["v1"],"clientConfig":{"service":{"name":"webhook-service","namespace":"metallb-system","path":"/validate-metallb-io-v1beta1-bgpadvertisement"}},"failurePolicy":"Fail","name":"bgpadvertisementvalidationwebhook.metallb.io","rules":[{"apiGroups":["metallb.io"],"apiVersions":["v1beta1"],"operations":["CREATE","UPDATE"],"resources":["bgpadvertisements"]}],"sideEffects":"None"},{"admissionReviewVersions":["v1"],"clientConfig":{"service":{"name":"webhook-service","namespace":"metallb-system","path":"/validate-metallb-io-v1beta1-community"}},"failurePolicy":"Fail","name":"communityvalidationwebhook.metallb.io","rules":[{"apiGroup
s":["metallb.io"],"apiVersions":["v1beta1"],"operations":["CREATE","UPDATE"],"resources":["communities"]}],"sideEffects":"None"},{"admissionReviewVersions":["v1"],"clientConfig":{"service":{"name":"webhook-service","namespace":"metallb-system","path":"/validate-metallb-io-v1beta1-ipaddresspool"}},"failurePolicy":"Fail","name":"ipaddresspoolvalidationwebhook.metallb.io","rules":[{"apiGroups":["metallb.io"],"apiVersions":["v1beta1"],"operations":["CREATE","UPDATE"],"resources":["ipaddresspools"]}],"sideEffects":"None"},{"admissionReviewVersions":["v1"],"clientConfig":{"service":{"name":"webhook-service","namespace":"metallb-system","path":"/validate-metallb-io-v1beta1-l2advertisement"}},"failurePolicy":"Fail","name":"l2advertisementvalidationwebhook.metallb.io","rules":[{"apiGroups":["metallb.io"],"apiVersions":["v1beta1"],"operations":["CREATE","UPDATE"],"resources":["l2advertisements"]}],"sideEffects":"None"}]}
    creationTimestamp: "2022-07-29T16:06:01Z"
    generation: 1
    name: metallb-webhook-configuration
    resourceVersion: "7266"
    uid: 8f7d20ac-dc86-4332-8d72-b5804c446cef
  webhooks:
  - admissionReviewVersions:
    - v1
    clientConfig:
      service:
        name: webhook-service
        namespace: metallb-system
        path: /validate-metallb-io-v1beta2-bgppeer
        port: 443
    failurePolicy: Fail
    matchPolicy: Equivalent
    name: bgppeersvalidationwebhook.metallb.io
    namespaceSelector: {}
    objectSelector: {}
    rules:
    - apiGroups:
      - metallb.io
      apiVersions:
      - v1beta2
      operations:
      - CREATE
      - UPDATE
      resources:
      - bgppeers
      scope: '*'
    sideEffects: None
    timeoutSeconds: 10
  - admissionReviewVersions:
    - v1
    clientConfig:
      service:
        name: webhook-service
        namespace: metallb-system
        path: /validate-metallb-io-v1beta1-addresspool
        port: 443
    failurePolicy: Fail
    matchPolicy: Equivalent
    name: addresspoolvalidationwebhook.metallb.io
    namespaceSelector: {}
    objectSelector: {}
    rules:
    - apiGroups:
      - metallb.io
      apiVersions:
      - v1beta1
      operations:
      - CREATE
      - UPDATE
      resources:
      - addresspools
      scope: '*'
    sideEffects: None
    timeoutSeconds: 10
  - admissionReviewVersions:
    - v1
    clientConfig:
      service:
        name: webhook-service
        namespace: metallb-system
        path: /validate-metallb-io-v1beta1-bfdprofile
        port: 443
    failurePolicy: Fail
    matchPolicy: Equivalent
    name: bfdprofilevalidationwebhook.metallb.io
    namespaceSelector: {}
    objectSelector: {}
    rules:
    - apiGroups:
      - metallb.io
      apiVersions:
      - v1beta1
      operations:
      - DELETE
      resources:
      - bfdprofiles
      scope: '*'
    sideEffects: None
    timeoutSeconds: 10
  - admissionReviewVersions:
    - v1
    clientConfig:
      service:
        name: webhook-service
        namespace: metallb-system
        path: /validate-metallb-io-v1beta1-bgpadvertisement
        port: 443
    failurePolicy: Fail
    matchPolicy: Equivalent
    name: bgpadvertisementvalidationwebhook.metallb.io
    namespaceSelector: {}
    objectSelector: {}
    rules:
    - apiGroups:
      - metallb.io
      apiVersions:
      - v1beta1
      operations:
      - CREATE
      - UPDATE
      resources:
      - bgpadvertisements
      scope: '*'
    sideEffects: None
    timeoutSeconds: 10
  - admissionReviewVersions:
    - v1
    clientConfig:
      service:
        name: webhook-service
        namespace: metallb-system
        path: /validate-metallb-io-v1beta1-community
        port: 443
    failurePolicy: Fail
    matchPolicy: Equivalent
    name: communityvalidationwebhook.metallb.io
    namespaceSelector: {}
    objectSelector: {}
    rules:
    - apiGroups:
      - metallb.io
      apiVersions:
      - v1beta1
      operations:
      - CREATE
      - UPDATE
      resources:
      - communities
      scope: '*'
    sideEffects: None
    timeoutSeconds: 10
  - admissionReviewVersions:
    - v1
    clientConfig:
      service:
        name: webhook-service
        namespace: metallb-system
        path: /validate-metallb-io-v1beta1-ipaddresspool
        port: 443
    failurePolicy: Fail
    matchPolicy: Equivalent
    name: ipaddresspoolvalidationwebhook.metallb.io
    namespaceSelector: {}
    objectSelector: {}
    rules:
    - apiGroups:
      - metallb.io
      apiVersions:
      - v1beta1
      operations:
      - CREATE
      - UPDATE
      resources:
      - ipaddresspools
      scope: '*'
    sideEffects: None
    timeoutSeconds: 10
  - admissionReviewVersions:
    - v1
    clientConfig:
      service:
        name: webhook-service
        namespace: metallb-system
        path: /validate-metallb-io-v1beta1-l2advertisement
        port: 443
    failurePolicy: Fail
    matchPolicy: Equivalent
    name: l2advertisementvalidationwebhook.metallb.io
    namespaceSelector: {}
    objectSelector: {}
    rules:
    - apiGroups:
      - metallb.io
      apiVersions:
      - v1beta1
      operations:
      - CREATE
      - UPDATE
      resources:
      - l2advertisements
      scope: '*'
    sideEffects: None
    timeoutSeconds: 10
kind: List
metadata:
  resourceVersion: ""
➜  metallb 

Logs don't seem to get that far...

{"branch":"dev","caller":"main.go:141","commit":"dev","goversion":"gc / go1.18.3 / amd64","level":"info","msg":"MetalLB controller starting version 0.13.4 (commit dev, branch dev)","ts":"2022-07-29T16:13:35Z","version":"0.13.4"}
{"level":"error","ts":1659111218.5549269,"msg":"Failed to get API Group-Resources","error":"Get \"https://10.96.0.1:443/api?timeout=32s\": dial tcp 10.96.0.1:443: connect: no route to host","stacktrace":"sigs.k8s.io/controller-runtime/pkg/cluster.New\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/cluster/cluster.go:160\nsigs.k8s.io/controller-runtime/pkg/manager.New\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/manager/manager.go:322\ngo.universe.tf/metallb/internal/k8s.New\n\t/go/go.universe.tf/metallb/internal/k8s/k8s.go:128\nmain.main\n\t/go/go.universe.tf/metallb/main.go:193\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"}
{"level":"error","ts":1659111218.555622,"logger":"setup","msg":"unable to start manager","error":"Get \"https://10.96.0.1:443/api?timeout=32s\": dial tcp 10.96.0.1:443: connect: no route to host","stacktrace":"go.universe.tf/metallb/internal/k8s.New\n\t/go/go.universe.tf/metallb/internal/k8s/k8s.go:148\nmain.main\n\t/go/go.universe.tf/metallb/main.go:193\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"}

@MykolaBordakov
Author

@fedepaol I uploaded the validatingwebhookconfiguration content. Here it is:
validation_control.txt

@web-engineer

I think my issue may be network-related at present. I'm going to look at the KVM setup, reset things, and try again, as I feel like I've gone backwards rather than forwards... @MykolaBordakov seems to have certs showing in that configuration content, so I'm thinking this may be an inter-node communication issue on my setup; I'll check that over and try again.

@fedepaol
Member

Yep, they seem to be two different issues. In @web-engineer's case it seems the controller is not able to reach the API server, which is preventing it from patching the webhook. The webhook issue is just a side effect of that.

@fedepaol
Member

@MykolaBordakov just triple checking, because I am running out of ideas.
Do you still get the error even minutes after deploying MetalLB? Just checking whether you got the error only during the bootstrap phase.

@web-engineer

Yep, they seem to be two different issues. In @web-engineer it seems the controller is not able to reach the api server, which is preventing it to patch the webhook. The webhook issue is just a side effect of that.

Yep - after completely rebooting everything and trying a few options with the network config (but ultimately reverting), upon restart everything has come up. So I suspect that in my case the problem was connectivity between nodes preventing the services from starting up.

@MykolaBordakov
Author

@fedepaol I get the error no matter when I apply the layer 2 settings.
So, let's try to fix my problem. If the controller is OK, it means the service works well. But we can't reach it on port 443 at that IP address.
Does Kubernetes have a policy for webhook services?
Maybe something is wrong with the internal communication settings in the cluster...

@MykolaBordakov
Author

Also, I found that the controller has a strange IP:
controller-5bd9496b89-qtqkz 1/1 Running 0 24h 172.17.0.2 worker2envserv
speaker-7gnq2 1/1 Running 0 24h 10.209.10.95 worker2envserv
speaker-kgrp2 1/1 Running 0 24h 10.209.10.93 mssterenvserv
speaker-rb7vm 1/1 Running 0 24h 10.209.10.94 worker1envserv

It is deployed on a worker node but has IP 172.17.0.2

@B1ue-W01f

If anyone is using ArgoCD with resource exclusions for ValidatingWebhookConfiguration, that may be connected to your issue. Disabling the exclusions resolved my issue.

@Ariantrom

I have exactly the same problem with the webhooks. My Kubernetes cluster uses Weave. MetalLB was installed using the manifest. The IPAddressPool and L2Advertisement entities are not created. @MykolaBordakov have you solved this problem?

@fedepaol
Member

fedepaol commented Aug 1, 2022

Also, I found that the controller has a strange IP... It is deployed on a worker node but has IP 172.17.0.2

That's because it's on the pod subnet, while the speakers run in the host's network namespace.
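This split can be confirmed directly; a sketch, assuming the stock manifest's object names (the speaker DaemonSet is named `speaker` in the default install):

```shell
# The controller gets a pod-network IP; speaker pods show node IPs because
# the speaker DaemonSet runs with hostNetwork: true.
kubectl -n metallb-system get pods -o wide
kubectl -n metallb-system get daemonset speaker \
  -o jsonpath='{.spec.template.spec.hostNetwork}'
```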

@fedepaol
Member

fedepaol commented Aug 1, 2022

I have exactly the same problem with webhooks. My kubernetes cluster uses wave. Metallb installed using the manifest. The IPAddressPool and L2Advertisement entities are not created. @MykolaBordakov Have you solved this problem?

Can you share the logs of the controller pod?

@MykolaBordakov
Author

@Ariantrom Not yet. I found that this problem happens not only with MetalLB. Here's a link:
kubernetes/ingress-nginx#5401
Now I'm checking it...

@Ariantrom

commands_logs.txt

@Ariantrom

control2go@ugok-k3s-m01:~/metallb$ cat ipaddresspool.yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default
spec:
  addresses:
  - 10.100.4.100-10.100.4.105
  autoAssign: true

@fedepaol
Member

fedepaol commented Aug 1, 2022

To both: Can I get the output of

kubectl get svc -n metallb-system
kubectl get pods -n metallb-system

?

@Ariantrom

@fedepaol
Member

fedepaol commented Aug 1, 2022

Seems good. Can you also paste the logs of the apiserver (in case it tells us more than @MykolaBordakov's did)?

@Ariantrom

@fedepaol
Member

fedepaol commented Aug 1, 2022

Can you try to kill the controller and retry when the new one comes up?

@Ariantrom

Yes, but I've already done that.
https://pastebin.com/smxURDHy

@fedepaol
Member

fedepaol commented Aug 1, 2022

I need to dig more into this. If you are able to provide a reproducer with something like Vagrant, I'd be happy to debug it.
In the meantime, setting failurePolicy=Ignore should make the apiserver ignore the webhook failure.
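One hedged way to apply that setting without editing YAML by hand is a JSON patch per webhook entry; the index 6 here assumes the ordering shown in the `validatingwebhookconfiguration` dump earlier in the thread, where the l2advertisement webhook is the seventh entry:

```shell
# Workaround only, not a fix: stop a dead webhook from blocking applies.
# Repeat with the right /webhooks/<index>/ path for each failing webhook.
kubectl patch validatingwebhookconfiguration metallb-webhook-configuration \
  --type=json \
  -p='[{"op":"replace","path":"/webhooks/6/failurePolicy","value":"Ignore"}]'
```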

@MykolaBordakov
Author

One working solution: find the ValidatingWebhookConfiguration and delete the rule for MetalLB.
Original solution here:
https://programmerah.com/solved-kubernetes-ingress-srv-error-failed-calling-webhook-validate-nginx-ingress-kubernetes-io-51118/

It should work when your error is: connection refused!

@MykolaBordakov
Author

But the solution in my previous comment is bad practice!
We should dig deeper...

@MykolaBordakov
Author

Good day. I found the solution.
To solve it, I needed to create a new cluster.
First of all, when I used
kubeadm init --pod-network-cidr=10.244.0.0/16 I got an error: the system couldn't find --cri-socket. That was fixed by adding
--cri-socket=unix:///var/run/cri-dockerd.sock .
When I created the cluster as explained above, I hit this issue. But note: connect: connection refused means that the host refused the connection. Also, it didn't matter which webhook service I worked with.

Actually, I had two solutions:

  1. Delete the ValidatingWebhookConfiguration rule for metallb-webhook-configuration;
  2. Update failurePolicy=Ignore on the ValidatingWebhookConfiguration rule for metallb-webhook-configuration.
    The second one is better than the first, but neither of them solves the root cause of the problem.
    So, I wasted a lot of time looking for a good solution, and in the process killed my cluster :)
    But when recreating it, I decided to use --cri-socket=unix:///var/run/containerd/containerd.sock
    After that, I set up MetalLB again. And now my problem is gone.

So, if I understood correctly, this problem is connected with Docker and the way it works with its IP pool. For full understanding one would need to find the difference between unix:///var/run/containerd/containerd.sock and unix:///var/run/cri-dockerd.sock
PS: I'm using kubeadm version 1.24.0.
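For reference, the cluster recreation described above boils down to the following; a sketch built from the commands named in the comment (the `kubeadm reset` step is an assumption, added for completeness):

```shell
# Tear down the old cluster, then re-init with the containerd CRI socket
# instead of cri-dockerd.
sudo kubeadm reset
sudo kubeadm init --pod-network-cidr=10.244.0.0/16 \
  --cri-socket=unix:///var/run/containerd/containerd.sock
```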

@fedepaol
Member

fedepaol commented Aug 2, 2022

So, to make sure I understood: now the webhook is working fine without changing the failurePolicy, right?

@MykolaBordakov
Author

@fedepaol Yep. If you use unix:///var/run/containerd/containerd.sock you don't need to change the failurePolicy.

@fedepaol
Member

fedepaol commented Aug 2, 2022

OK, so that leaves @Ariantrom's issue. Are you able to reproduce it in a virtualized environment?

@timgriffiths

I just hit this today, although the issue I faced was that the no_proxy config automatically picked up from my host didn't exclude internal k8s traffic (i.e. ".svc"), so all webhook requests were getting pushed out to our internal proxy rather than staying internal to the cluster.

kubernetes/kubeadm#324

Hope that helps someone in the future
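A hedged sketch of the kind of exclusion that avoids this; the exact CIDRs are examples, so substitute your cluster's service and pod ranges:

```shell
# Keep in-cluster traffic (Service DNS names, service and pod CIDRs) away
# from the corporate proxy so webhook calls stay inside the cluster.
export NO_PROXY=".svc,.svc.cluster.local,10.96.0.0/12,10.244.0.0/16,localhost,127.0.0.1"
export no_proxy="$NO_PROXY"
```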

jlejeune added a commit to jlejeune/jlejeune.home-ansible that referenced this issue Aug 26, 2022
@fedepaol
Member

fedepaol commented Sep 7, 2022

Closing in favor of #1597, please provide the information requested there.
