keepalived segfault, keepalived runs regardless of protocol ? #390

bdowling · 2024-03-05T21:30:28Z

Describe the bug

We were experiencing constant segfaults in keepalived in our deployment, which was causing the MASTER to frequently bounce between nodes and services to be inconsistently reachable (frequent dropped connections).

Since there was no apparent cause, we decided to switch to layer2 mode. I switched the cluster to IPVS mode and re-deployed OpenELB via the Helm chart. However, I now have a new problem:

On startup, it still deploys keepalived, and when it does, all services become unreachable and even openelb-manager has trouble reaching the k8s API server. There is a log entry saying it is "cleaning ipvs configuration", which doesn't sound very good in a k8s IPVS environment.

From a quick read of the code, it looks to me like all the Speakers start up regardless of the deployed config, though I have only skimmed it.

I chose to kill the daemonset as I don't want it to segfault and cause problems with the LB services.

My question: why does the keepalived daemonset get started if there are no Eips with protocol "vip"?
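
To make the question concrete, here is a minimal sketch of the gating one might expect: a speaker only starts if at least one Eip uses its protocol. This is not OpenELB's code. The function names are hypothetical, and the protocol names other than "vip" are assumptions.

```python
from typing import Iterable, Set

# Assumed protocol names. Only "vip" is quoted in this issue; the rest are assumptions.
KNOWN_PROTOCOLS = {"bgp", "layer2", "vip"}


def speakers_to_start(eip_protocols: Iterable[str]) -> Set[str]:
    """Return the speakers that at least one configured Eip actually needs."""
    wanted = {p.strip().lower() for p in eip_protocols}
    # Ignore anything that is not a known protocol rather than starting a speaker for it.
    return wanted & KNOWN_PROTOCOLS


def needs_keepalived(eip_protocols: Iterable[str]) -> bool:
    """keepalived is only required when some Eip uses the "vip" protocol."""
    return "vip" in speakers_to_start(eip_protocols)


if __name__ == "__main__":
    # A layer2-only cluster, as described in this issue.
    print(needs_keepalived(["layer2", "layer2"]))
```

With a check like this, a layer2-only deployment would never schedule the keepalived pods, so the IPVS cleanup step would not run.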

Other than that, services seem more reliable in Layer2 with the simple Speaker setup. The services work fine, the ARPs are being sent out, etc.

Keepalived Startup Log

I0305 21:20:04.413733       7 main.go:196] Creating API server client for https://10.66.0.1:443
I0305 21:20:04.414209       7 main.go:136] starting LVS configuration
I0305 21:20:05.519985       7 main.go:388] No interface was provided, proceeding with the node's default: eth0
I0305 21:20:05.522272       7 main.go:242] cleaning ipvs configuration
E0305 21:20:05.524371       7 reflector.go:126] github.com/aledbf/kube-keepalived-vip/pkg/controller/main.go:293: Failed to list *v1.Service: Get https://10.66.0.1:443/api/v1/services?limit=500&resourceVersion=0: write tcp 10.66.0.1:55816->100.66.0.1:443: write: broken pipe

Output

[Tue Mar  5 18:52:54 2024] keepalived[2493485]: segfault at 8 ip 000055d3de676afc sp 00007ffcb05763b0 error 4 in keepalived[55d3de65a000+5d000]
[Tue Mar  5 18:52:54 2024] Code: 00 00 01 41 89 86 88 00 00 00 48 8b 4b 08 48 8b 55 08 48 c7 84 24 f0 00 00 00 00 00 00 00 48 c7 84 24 10 01 00 00 00 00 00 00 <48> 8b 41 08 4c 8b 59 10 48 8b 4a 08 48 8b 52 10 48 89 84 24 e0 00
[Tue Mar  5 18:53:10 2024] keepalived[2494237]: segfault at 8 ip 000055d3de676afc sp 00007ffcb0576460 error 4 in keepalived[55d3de65a000+5d000]
[Tue Mar  5 18:53:10 2024] Code: 00 00 01 41 89 86 88 00 00 00 48 8b 4b 08 48 8b 55 08 48 c7 84 24 f0 00 00 00 00 00 00 00 48 c7 84 24 10 01 00 00 00 00 00 00 <48> 8b 41 08 4c 8b 59 10 48 8b 4a 08 48 8b 52 10 48 89 84 24 e0 00
[Tue Mar  5 18:53:30 2024] keepalived[2494347]: segfault at 8 ip 000055d3de676afc sp 00007ffcb0576460 error 4 in keepalived[55d3de65a000+5d000]
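
For anyone triaging similar lines, the following is a small sketch that pulls the fields out of a kernel segfault message and computes the offset of the faulting instruction within the mapped region. The regex and function name are my own, and the error-code decoding assumes the x86 page-fault bit layout (bit 0 = page present, bit 1 = write, bit 2 = user mode).

```python
import re

# Matches the x86 kernel "segfault at ..." message format seen in the output above.
SEGFAULT_RE = re.compile(
    r"(?P<comm>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\s\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Return a dict of decoded fields, or None if the line is not a segfault message."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m["err"], 16)
    return {
        "pid": int(m["pid"]),
        "object": m["obj"],
        "fault_addr": int(m["addr"], 16),
        # Offset of the faulting instruction inside the mapping reported by the kernel.
        "offset": int(m["ip"], 16) - int(m["base"], 16),
        "page_present": bool(err & 1),
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
    }


LINE = (
    "[Tue Mar  5 18:52:54 2024] keepalived[2493485]: segfault at 8 "
    "ip 000055d3de676afc sp 00007ffcb05763b0 error 4 in keepalived[55d3de65a000+5d000]"
)

if __name__ == "__main__":
    info = parse_segfault(LINE)
    print(hex(info["offset"]), info)
```

For the lines above this gives offset 0x1cafc in every crash, with a user-mode read of a non-present page at address 8. That pattern is consistent with a field being read through a NULL pointer, though only a debug build or core dump would confirm it.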

Version Info

Version of Kubernetes: v1.28.4
Version of OpenELB: v0.5.0

bdowling · 2024-03-06T01:22:27Z

As I dig a little deeper, I realize that the dependent container https://github.com/aledbf/kube-keepalived-vip is an archived project that hasn't been updated in 5 years. That's a lifetime in k8s ecosystem years.
It ships v2.0.19 of keepalived (circa 2022; 2.2.8 is from May 2023).

What are the plans around this? There should definitely be an option to not be forced to run keepalived-vip containers if they are not supported/maintained.

bdowling · 2024-03-06T01:50:45Z

Related #370 #285

renyunkang · 2024-03-06T02:23:47Z

We are still working on this. In the future, a parameter will control whether the speaker for each protocol is started.

kube-keepalived-vip will also continue to be updated; please follow https://github.com/openelb/kube-keepalived-vip

bdowling · 2024-03-08T18:41:28Z

FWIW, for anyone who does not want the keepalived pods running when not using VIP mode: I simply applied a nodeSelector that matches no nodes to the openelb-keepalived-vip daemonset:

      nodeSelector:
        disabled: Manually-disabled-not-needed-for-layer2-mode
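
The same nodeSelector can be applied as a merge patch instead of editing the daemonset by hand. The sketch below only builds the patch and the `kubectl patch` argument list. The helper names are hypothetical, and the `openelb-system` namespace is an assumption, so adjust it to wherever the chart was installed.

```python
import json


def disable_patch(label_key="disabled",
                  label_value="Manually-disabled-not-needed-for-layer2-mode"):
    """Build a merge patch that sets a nodeSelector no node is expected to satisfy."""
    return {
        "spec": {
            "template": {
                "spec": {"nodeSelector": {label_key: label_value}}
            }
        }
    }


def kubectl_command(daemonset, namespace):
    """Return the kubectl argument list that applies the patch to a daemonset."""
    patch = json.dumps(disable_patch(), separators=(",", ":"))
    return ["kubectl", "patch", "daemonset", daemonset,
            "-n", namespace, "--type", "merge", "-p", patch]


if __name__ == "__main__":
    # Namespace is an assumption; the daemonset name is the one mentioned above.
    print(" ".join(kubectl_command("openelb-keepalived-vip", "openelb-system")))
```

Because no node carries the `disabled` label, the daemonset schedules zero pods. A later `helm upgrade` may reset the nodeSelector, so the patch may need to be re-applied.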

bdowling mentioned this issue Mar 8, 2024

kube-keepalived-vip - keepalived instance template access needed #393

Open

renyunkang added the kind/feature label Mar 11, 2024
