If NetworkManager is running connectivity-checks expect some suspicious looking logs #4480

Open
iamasmith opened this issue Apr 3, 2024 · 0 comments

Summary

I found an initially worrying set of events logged by calico-node that reference various public IP ranges. This is a stock install of microk8s on Ubuntu 22.04 LTS.

They look like this:

2024-04-02 21:56:17.029 [WARNING][75] felix/route_table.go 1053: Failed to delete neighbor FDB entry {LinkIndex:26 Family:7 State:128 Type:0 Flags:2 FlagsExt:0 IP:185.125.190.18 HardwareAddr: LLIPAddr:<nil> Vlan:0 VNI:0 MasterIndex:0} error=invalid argument ifaceName="vxlan.calico" ifaceRegex="^vxlan.calico$" ipVersion=0x4 tableIndex=254

Each of these messages is accompanied by two others, so I'm including them for completeness (the order is as exported from my Loki/Grafana deployment, so newest message first). The group of all three looks like this:

2024-04-02 21:56:17.029 [WARNING][75] felix/route_table.go 1203: Failed to access interface but it appears to be up error=netlink update operation failed ifaceName="vxlan.calico" ifaceRegex="^vxlan.calico$" ipVersion=0x4 link=&netlink.Vxlan{LinkAttrs:netlink.LinkAttrs{Index:26, MTU:1450, TxQLen:0, Name:"vxlan.calico", HardwareAddr:net.HardwareAddr{0x66, 0xe0, 0xc4, 0x1b, 0xe5, 0xdd}, Flags:0x13, RawFlags:0x11043, ParentIndex:0, MasterIndex:0, Namespace:interface {}(nil), Alias:"", Statistics:(*netlink.LinkStatistics)(0xc0009f3200), Promisc:0, Allmulti:0, Multi:1, Xdp:(*netlink.LinkXdp)(0xc000921008), EncapType:"ether", Protinfo:(*netlink.Protinfo)(nil), OperState:0x0, PhysSwitchID:0, NetNsID:-1, NumTxQueues:1, NumRxQueues:1, GSOMaxSize:0x10000, GSOMaxSegs:0xffff, GROMaxSize:0x10000, Vfs:[]netlink.VfInfo(nil), Group:0x0, Slave:netlink.LinkSlave(nil)}, VxlanId:4096, VtepDevIndex:2, SrcAddr:net.IP{<node private IP redacted>}, Group:net.IP(nil), TTL:0, TOS:0, Learning:false, Proxy:false, RSC:false, L2miss:false, L3miss:false, UDPCSum:true, UDP6ZeroCSumTx:false, UDP6ZeroCSumRx:false, NoAge:false, GBP:false, FlowBased:false, Age:300, Limit:0, Port:4789, PortLow:0, PortHigh:0} tableIndex=254
2024-04-02 21:56:17.029 [INFO][75] felix/route_table.go 1067: Removed old neighbor ARP entry ifaceName="vxlan.calico" ifaceRegex="^vxlan.calico$" ipVersion=0x4 neighbor=netlink.Neigh{LinkIndex:26, Family:2, State:32, Type:1, Flags:0, FlagsExt:0, IP:net.IP{0xb9, 0x7d, 0xbe, 0x12}, HardwareAddr:net.HardwareAddr(nil), LLIPAddr:net.IP(nil), Vlan:0, VNI:0, MasterIndex:0} tableIndex=254
2024-04-02 21:56:17.029 [WARNING][75] felix/route_table.go 1053: Failed to delete neighbor FDB entry {LinkIndex:26 Family:7 State:128 Type:0 Flags:2 FlagsExt:0 IP:185.125.190.18 HardwareAddr: LLIPAddr:<nil> Vlan:0 VNI:0 MasterIndex:0} error=invalid argument ifaceName="vxlan.calico" ifaceRegex="^vxlan.calico$" ipVersion=0x4 tableIndex=254
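The ARP entry in the INFO line prints the neighbor address as a Go byte dump rather than in dotted form, which makes it harder to see that it is the same address as in the FDB warnings. A minimal Python sketch for decoding it follows; the helper name `decode_go_ip` is made up for illustration and is not part of Calico or any library.

```python
import ipaddress


def decode_go_ip(literal: str) -> ipaddress.IPv4Address:
    """Turn a Go-style dump like 'net.IP{0xb9, 0x7d, 0xbe, 0x12}' into an IPv4 address.

    Hypothetical helper: it only handles the 4-byte form seen in the log above.
    """
    # Take the text between the braces and parse each hex byte.
    inner = literal[literal.index("{") + 1 : literal.index("}")]
    octets = bytes(int(part.strip(), 16) for part in inner.split(","))
    return ipaddress.IPv4Address(octets)


if __name__ == "__main__":
    # Byte dump copied from the "Removed old neighbor ARP entry" line.
    print(decode_go_ip("net.IP{0xb9, 0x7d, 0xbe, 0x12}"))  # prints 185.125.190.18
```

So the ARP entry that was removed and the FDB entry that failed to delete refer to the same public address, 185.125.190.18.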

They were happening every 5 minutes or so, and initially I was worried that something public had created a path to the Calico network. (I just installed the stock setup with the default CNI and haven't used Calico in any depth, so I need to do a bit more studying for my own peace of mind to understand what it is and isn't capable of. I'm more familiar with Cilium and with what Istio can do in multi-cluster environments.)

I left tcpdump watching for DNS traffic out of these nodes and picked up lookups for connectivity-check.ubuntu.com. The PTR records for the addresses in my logs were all over the Canonical zone, but when I checked the addresses returned for this record, they all matched the ones in my logs.
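The comparison described above can be scripted. The sketch below pulls the addresses out of the "Failed to delete neighbor FDB entry" warnings and checks them against the A records for connectivity-check.ubuntu.com. The function names and the regular expression are my own assumptions based on the log format shown in this issue, not anything provided by Calico, and the DNS answer may differ from what I saw.

```python
import re
import socket

# Matches the IP field inside the FDB entry dump of the felix warning.
FDB_IP = re.compile(
    r"Failed to delete neighbor FDB entry \{[^}]*\bIP:(\d+\.\d+\.\d+\.\d+)"
)


def logged_ips(lines):
    """Collect the distinct IPv4 addresses named in FDB warnings."""
    return {m.group(1) for line in lines if (m := FDB_IP.search(line))}


def resolve_ipv4(hostname):
    """Return the set of IPv4 addresses the resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return {info[4][0] for info in infos}


def unexplained(ips, known):
    """Addresses seen in the logs that the hostname does not account for."""
    return set(ips) - set(known)


if __name__ == "__main__":
    # Abridged copy of the warning from this issue.
    sample = [
        "[WARNING][75] felix/route_table.go 1053: Failed to delete neighbor FDB "
        "entry {LinkIndex:26 Family:7 State:128 Type:0 Flags:2 FlagsExt:0 "
        "IP:185.125.190.18 HardwareAddr: LLIPAddr:<nil> Vlan:0 VNI:0 "
        "MasterIndex:0} error=invalid argument"
    ]
    try:
        known = resolve_ipv4("connectivity-check.ubuntu.com")
    except OSError as exc:  # no network or DNS failure
        print(f"lookup failed: {exc}")
    else:
        print("not explained by the connectivity check:",
              unexplained(logged_ips(sample), known))
```

An empty result means every logged address is one of the connectivity-check hosts.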

It is caused by the NetworkManager connectivity check attempting to reach connectivity-check.ubuntu.com on each interface. This appears to confuse the hell out of Calico, and I suspect that the public addresses that match that host name might be ones in the connection table in CLOSE_WAIT state.

What Should Happen Instead?

A more informed analysis by somebody who has gone deeper into Calico, and possibly a prominent note, warning, or documentation about what to expect if these checks are left active.

Reproduction Steps

  1. Stock install of Ubuntu LTS 22.04. (It may be significant that I built the main node initially with a GNOME desktop; I'm not sure whether this explicitly enables the checks or whether they are on by default.)
  2. Add microk8s
  3. Watch logs

Introspection Report

n/a

Can you suggest a fix?

An option to disable the checks when microk8s is installed from the snap. These messages disappear once the check is disabled. I just set interval=0, as this was the most fully documented setting; the docs for enabled don't specify which of the yes/no/true/false variants are accepted, so again the docs could be improved. I left the uri in place for future reference.

root@nuc1:~# cat /lib/NetworkManager/conf.d/20-connectivity-ubuntu.conf 
[connectivity]
interval=0
uri=http://connectivity-check.ubuntu.com./
root@nuc1:~# 
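For anyone wanting to check their own nodes, here is a rough Python sketch that reads a [connectivity] section like the one above and reports whether it turns the check off. It assumes, per my reading of the NetworkManager.conf documentation, that interval=0 or a missing/empty uri disables the check. It deliberately ignores the enabled key, since I don't know which value spellings are accepted, and it looks at a single file only rather than the merged NetworkManager configuration. The function name is made up for illustration.

```python
import configparser


def connectivity_check_disabled(conf_text: str) -> bool:
    """Return True if this config text switches the connectivity check off.

    Assumptions (not verified against NetworkManager source):
    interval=0 disables the check, and so does a missing or empty uri.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("connectivity"):
        return True  # this file configures no uri at all
    section = parser["connectivity"]
    if not section.get("uri", "").strip():
        return True
    try:
        # Assumed default of 300 seconds when interval is not set.
        return int(section.get("interval", "300")) == 0
    except ValueError:
        return False  # unparseable interval: don't claim it is disabled


if __name__ == "__main__":
    # Same content as the file shown above.
    text = (
        "[connectivity]\n"
        "interval=0\n"
        "uri=http://connectivity-check.ubuntu.com./\n"
    )
    print(connectivity_check_disabled(text))  # prints True
```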

Are you interested in contributing with a fix?

Probably not needed, and it is up to the maintainers which direction they wish to take on this issue.
