
After disabling ClusterMesh, Cilium seems to think it still exists despite being disabled #32300

Closed
samip5 opened this issue May 2, 2024 · 3 comments · Fixed by cilium/cilium-cli#2544
Labels
  • area/clustermesh: Relates to multi-cluster routing functionality in Cilium.
  • kind/bug: This is a bug in the Cilium logic.
  • kind/community-report: This was reported by a user in the Cilium community, eg via Slack.
  • needs/triage: This issue requires triaging to establish severity and next steps.
  • sig/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.

Comments

samip5 commented May 2, 2024

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

I had used ClusterMesh for a short while, and after disabling it, Cilium's daemon health still suggests that it is enabled. I have also uninstalled it via Helm, but that made no difference.
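
A minimal sketch of the disable steps described above; the exact Helm value name is an assumption based on the upstream cilium/cilium chart, not copied from the affected cluster:

$ cilium clustermesh disable
# Roll back the Helm-managed side as well (assumed value name; adjust to
# match the actual release and values in use):
$ helm upgrade cilium cilium/cilium -n kube-system --reuse-values \
    --set clustermesh.useAPIServer=false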

Cilium Version

cilium-cli: v0.16.4 compiled with go1.22.1 on darwin/arm64
cilium image (default): v1.15.3
cilium image (stable): v1.15.4
cilium image (running): 1.15.4

Kernel Version

5.15.0-105-generic

Kubernetes Version

v1.28.2+k3s1

Regression

No response

Sysdump

cilium-sysdump-20240502-113512.zip

Relevant log output

$ ./k8s-cilium-exec.sh cilium-dbg status
==== detail from pod cilium-wqk57 , on node plex-server
KVStore:                Ok   Disabled
Kubernetes:             Ok   1.28 (v1.28.2+k3s1) [linux/amd64]
Kubernetes APIs:        ["EndpointSliceOrEndpoint", "cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumLocalRedirectPolicy", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "cilium/v2alpha1::CiliumCIDRGroup", "core/v1::Namespace", "core/v1::Pods", "core/v1::Service", "networking.k8s.io/v1::NetworkPolicy"]
KubeProxyReplacement:   Strict   [vlan.10     192.168.2.129 2001:14ba:7475:4900::500 2001:14ba:7475:4900:a236:9fff:fe18:55fb fe80::a236:9fff:fe18:55fb (Direct Routing)]
Host firewall:          Disabled
SRv6:                   Disabled
CNI Chaining:           none
Cilium:                 Ok   1.15.4 (v1.15.4-9b3f9a8c)
NodeMonitor:            Listening for events on 8 CPUs with 64x4096 of shared memory
IPAM:                   IPv4: 38/254 allocated from 10.40.0.0/24, IPv6: 38/18446744073709551614 allocated from fd94:9bde:1ebb::/64
ClusterMesh:            0/1 clusters ready, 0 global-services
   nebula: not-ready, 0 nodes, 0 endpoints, 0 identities, 0 services, 0 failures (last: never)
   └  Waiting for initial connection to be established
   └  remote configuration: expected=unknown, retrieved=unknown
   └  synchronization status: nodes=false, endpoints=false, identities=false, services=false
IPv4 BIG TCP:        Disabled
IPv6 BIG TCP:        Disabled
BandwidthManager:    Disabled
Host Routing:        BPF
Masquerading:        BPF   [vlan.10]   10.40.0.0/16 [IPv4: Enabled, IPv6: Enabled]
Controller Status:   234/236 healthy
  Name                                  Last success   Last error   Count   Message
  endpoint-2485-regeneration-recovery   never          54s ago      230     regeneration recovery failed
  remote-etcd-nebula                    never          3m23s ago    57      timed out while waiting for etcd session. Ensure that etcd is running on [https://nebula.mesh.cilium.io:2379]
Proxy Status:            OK, ip 10.40.0.92, 0 redirects active on ports 10000-20000, Envoy: embedded
Global Identity Range:   min 196608, max 262143
Hubble:                  Ok         Current/Max Flows: 4095/4095 (100.00%), Flows/s: 129.40   Metrics: Ok
Encryption:              Disabled
Cluster health:                        Warning   cilium-health daemon unreachable
Modules Health:          Stopped(0) Degraded(1) OK(10) Unknown(3)

Anything else?

No response

Cilium Users Document

  • Are you a user of Cilium? Please add yourself to the Users doc

Code of Conduct

  • I agree to follow this project's Code of Conduct
samip5 added the kind/bug, kind/community-report, and needs/triage labels on May 2, 2024
@youngnick (Contributor) commented:

Thanks for this issue @samip5, I can see there don't seem to be any directions for removing clustermesh on docs.cilium.io, so I'll mark this for the team's attention.

youngnick added the sig/datapath and area/clustermesh labels on May 3, 2024
samip5 (Author) commented May 3, 2024

> Thanks for this issue @samip5, I can see there don't seem to be any directions for removing clustermesh on docs.cilium.io, so I'll mark this for the team's attention.

I did try the normal cilium clustermesh disable, which I found in the CLI command help.
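
One quick way to check whether leftover remote-cluster configuration is still present after running the disable command; the secret name, container name, and mount path below are assumptions based on a default Helm install:

# Inspect the clustermesh secret that holds the remote cluster configs:
$ kubectl -n kube-system get secret cilium-clustermesh -o jsonpath='{.data}'
# Or check what the agents actually have mounted:
$ kubectl -n kube-system exec ds/cilium -c cilium-agent -- ls /var/lib/cilium/clustermesh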

giorio94 added a commit to giorio94/cilium-cli that referenced this issue May 10, 2024
cilium/cilium#28763 decoupled the helm settings to enable the
clustermesh-apiserver and provide the list of clusters to connect to.
Let's reflect this change to the 'cilium clustermesh disable' command
as well, explicitly disabling and resetting the remote clusters configs
when invoked, to correctly disconnect from possible leftover clusters.

Fixes: cilium/cilium#32300
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
@giorio94 (Member) commented:

Thanks @samip5 for the report. I've raised cilium/cilium-cli#2544 to fix the issue by explicitly disabling the clustermesh configuration and resetting the list of connected clusters when running cilium clustermesh disable.
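
A rough Helm-level equivalent of what the fixed cilium clustermesh disable is described as doing; the value names are assumptions based on the cilium/cilium chart after cilium/cilium#28763, not quoted from the PR:

$ helm upgrade cilium cilium/cilium -n kube-system --reuse-values \
    --set clustermesh.useAPIServer=false \
    --set clustermesh.config.enabled=false
# The list of connected clusters (clustermesh.config.clusters) is also reset,
# so no leftover remote-cluster entries such as 'nebula' remain.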

michi-covalent pushed a commit to cilium/cilium-cli that referenced this issue May 15, 2024