Replies: 6 comments 1 reply
-
I like the model but this change would have a huge impact on every report, query etc. because node_ip will now have multiple matches for joins for stuff that was a single row before, wouldn't it? Even when including active, since an arp entry can be in many routers at a time? I'm open to it but a less impactful/easier to judge for breakage could be to refactor the PR like so:
|
Beta Was this translation helpful? Give feedback.
-
Actually what you describe is a patch I have running in a customized datacenter installation since 2016: rc9000/nd2_private_patches@f0b7bc8 (minus the vrf bit) It works well, but I made it a separate table which is very inefficient. But I suspected various breakage from such a cardinality change and was not brave enough to just replace node_ip :) |
Beta Was this translation helpful? Give feedback.
-
Thanks for the feedback @rc9000! Let me make an effort to show why I think it won't blow up (mostly) for existing users... but as you have a lot more extension/integration experience I take your word for it... (TL;DR: searching for a node_ip will get only ONE active row for any Consider the L2 case today where a node is in multiple switch mac-address tables due to MLAG or ESI (EVPN). We can also say there is no VLAN information being provided, causing VLAN=0. In this situation Netdisco will use a "last device wins globally" when updating node table, so the recent macsuck switch gets the active=true entry for Moving to L3, a node_ip is in multiple router ARP tables. As there's no VRF support, we have VRF=0 all the time. In this situation Netdisco can use "last device wins globally" when updating node_ip table, so the recent arpnip router gets the active=true entry for This means searching for a node_ip gets one active row for any For active=false, yes, there can be multiple rows for I can imagine a query to see the history of MAC addresses that an IP has had, and some rows will appear more than once, so DISTINCT ON Perhaps it comes down to whether we ever want "breaking" changes in Netdisco 2 DB schema, which this potentially is for active=false node_ip entries. Are people doing such queries (or queries without active at all)? |
Beta Was this translation helpful? Give feedback.
-
Yes that's my main concern, and I was able to find a bunch quickly in repos where I have given netdisco SQL access to people, but also in the wild - see attached file. And all these queries will start to work differently, in places where we have no control over or don't even know. I think one common misconception is that you have to line up the time_first/last in node/node_ip to "find the ip of the pc", and I had it myself when I started using Netdisco SQL, as in:
so I imagine code like this is just all over the place. So for the current feature, I'd really either use a separate table that can be joined in case of "i know what I'm doing", or as denormalized columns in node_ip. For the long term outlook, I'd love to have the arp entries keyed to vrf and l3 device, but I'd actually go a step further and abandon time_first/last for a schema that stores every arpnip/macsuck result separately. This would be a lot better for forensics and the now popular question of "how many people are at the office at the same time on a busy day". Database size is much less an issue now than at the time the schema was conceived. |
Beta Was this translation helpful? Give feedback.
-
Also as a completely separate thought, isn't active much less meaningful then? Currently:
but with the router key, there is only one node_ip active but in fact many of them on many routers can be active in non-trivial networks. Not that I'd expect big issues here, it's more an aesthetic consideration. |
Beta Was this translation helpful? Give feedback.
-
OK OK got it, thank you for the patience. Let's go with "seen_on_router_first (jsonb), seen_on_router_last (jsonb)" :-) |
Beta Was this translation helpful? Give feedback.
-
In #1174 we made a good start to implement storing the origin of an arp entry (the MAC IP mapping) with some history. Unfortunately the use of custom_fields for this is rather cumbersome so I have been thinking about other options.
It occurred to me that for mac-address table entries we already have a similar feature. Consider:
The composite primary key of
{mac, switch, port, vlan}
allows the node to move around and usingactive
field we can show the current location.We could implement, therefore:
The composite primary key of
{mac, router, ip, vrf}
allows the node_ip to move around and usingactive
field we can show the current location. (the new fields would berouter
,vrf
,time_recent
-- all others exist today)The field
router
for node_ip is analogous to the fieldswitch
for node. The fieldip
for node_ip is analogous toport
for node. Thevrf
is analogous to thevlan
, especially as we use VLAN 0 for "not reported" so we can start there with VRF too.The logic used for updating node can be copied to updating node_ip. It's basically two ways to track MAC addresses, working very similarly, but one at L2 and one at L3.
What have I missed? ...
Beta Was this translation helpful? Give feedback.
All reactions