Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ncm-nmstate: relax syntax of VLAN interface names #1679

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

jouvin
Copy link
Contributor

@jouvin jouvin commented Apr 16, 2024

Fixes #1678

@jouvin jouvin requested review from jrha and aka7 April 16, 2024 09:08
@jouvin jouvin force-pushed the nmstate_relax_vlan_name_syntax branch from 122dd65 to 4f4c11c Compare April 16, 2024 09:24
@jouvin
Copy link
Contributor Author

jouvin commented Apr 16, 2024

Implementing a fix I uncovered something I consider as a flaw in the current implementation: to guess the VLAN ID, the interface name is took in priority to the device name. It is a problem if you remove the need for a . before the VLAN ID because there is a chance, like in vlan0 example that you don't get the right VLAN ID (in the example it should be 123, as the device is eth0.123). For some reasons, it seems the tests are passing but I'd suggest to take device first and the interface name if device doesn't provide the information. But it has the risk of being a backward incompatible change.... What's your feeling? Did I miss something?

@jouvin jouvin force-pushed the nmstate_relax_vlan_name_syntax branch from 4f4c11c to 8dfd32d Compare April 16, 2024 09:40
@aka7
Copy link
Contributor

aka7 commented Apr 16, 2024

When I did this for backward compatibility for network.pm, I followed this test file.
https://github.com/quattor/configuration-modules-core/blob/master/ncm-network/src/test/resources/route_rule.pan#L36

so I think long as your new change still covers this? then it should be fine for backward compatibility with network.pm?

Due to legacy code we had internally, we only define vlan interface using interfacename.vlan_id, i.e eth0.123 format as part of the interface name when defining vlan iface. So long as this test passes,
https://github.com/quattor/configuration-modules-core/blob/master/ncm-network/src/test/resources/nmstate_advance.pan#L30
we should be ok on our end.
so LGTM but I will test your PR against our profiles later this week to confirm.

regards

@jouvin
Copy link
Contributor Author

jouvin commented Apr 16, 2024

To be clear, my PR currently only relax the pattern matching on interface/device name but doesn't implement my second suggestion of inverting the order in which we use interface name and device to guess the VLAN ID. As for us this second point is not very important because we always use the same value for the interface name and the device. But I have the feeling you would allow the interface name not to convey the VLAN ID but have it attached to the device name instead...

Anyway, with my fix, I'm a bit surprised that vlan0 test is still working... I should look in more details. But this fix has been deployed at IJCLab and fixes the problem we had before as our VLAN interfaces are all named vlanxxx, xxx being the VLAN ID.

@aka7
Copy link
Contributor

aka7 commented Apr 16, 2024

@jouvin from what I see, vlan0 test config is working as expected with your change since you just relaxed it and not removed anything. vlan0 config is using device=eth0.123 to get the vlanid. (https://github.com/quattor/configuration-modules-core/blob/master/ncm-network/src/test/resources/nmstate_advance.pan#L40) so that is correct and I wouldn't have expected your change to break it? so IMO it is working as expected. Unless I'm misunderstanding the solution here.

I'm not fussed with order of check, as we are only setting it one way which is using the format "/system/network/interfaces/eth0.123"

so long as this format of config carries on working too, it should be fine.

I just did quick check on lab host and its a noop as expected.

@jouvin
Copy link
Contributor Author

jouvin commented Apr 16, 2024

@aka7 good! I'll see if I can come up with the reverse order giving the current result. For me it should...

My point with vlan0 is that with the relaxed regex, the ifnzme should match the regex and the vlan ID should be set to 0 which is not what we want. And for me the fix would be to reverse the order as the device will be carry the actual vlan0 ID and will be tested first. I guess it is working because ID 0 is causing the test to evaluate to false. My initial attempt to add vlan1 was failing in fact because it returned ID 1 instead of 123.

@jouvin jouvin force-pushed the nmstate_relax_vlan_name_syntax branch from 8dfd32d to 5b9837d Compare April 18, 2024 08:11
@jouvin
Copy link
Contributor Author

jouvin commented Apr 18, 2024

@aka7 @stdweird I'm trying to fix a problem I identified with the way returned VLAN ID from the interface/device name is tested. The fix is trivial but I'm trying to add a test for this and I'm having trouble because adding the needed lines in the test profile make all the tests failing (including the first one which is just compiling the profile if I'm right) with something that looks as a Pan error except that compiling the profile manually gives no error. Surely a trivial mistake of mine... Any idea?

@aka7
Copy link
Contributor

aka7 commented Apr 18, 2024

@jouvin there is planned meetup discussion today, are you able to join todays call, perhaps we can discuss it there?

@aka7
Copy link
Contributor

aka7 commented Apr 18, 2024

@aka7 good! I'll see if I can come up with the reverse order giving the current result. For me it should...

My point with vlan0 is that with the relaxed regex, the ifnzme should match the regex and the vlan ID should be set to 0 which > is not what we want. And for me the fix would be to reverse the order as the device will be carry the actual vlan0 ID and will be > tested first. I guess it is working because ID 0 is causing the test to evaluate to false. My initial attempt to add vlan1 was failing > in fact because it returned ID 1 instead of 123.

yes I just tried this on the test file, change interface name from vlan0 -> vlan1 and test fails.
May need to look at regex, because interface name with vlan0 is ignored but anything >0 is used as vlan id, which is not what we want on interface name? so I dont think you want to relax the regex for the interface name.

@jouvin
Copy link
Contributor Author

jouvin commented Apr 18, 2024

@aka7 sorry, I'm at a conference this week, it was impossible for me to join the meetup... Don't spend time trying to fix tests, I have almost everything done but currently I only published the side issue I identified and I don't understand why adding a VLAN interface in the test profile beaks the Pan compilation despite Pan being able to compile it successfully if run manually...

@jouvin
Copy link
Contributor Author

jouvin commented Apr 18, 2024

@stdweird any chance you could have a look why the second commit is breaking all tests... I'm sure it's trivial and you'll identify the problem very quickly! Just my lack of Perl/Quattor development these last years!

@jouvin
Copy link
Contributor Author

jouvin commented May 7, 2024

@stdweird I'm still looking for some help to sort out my (probably stupid) mistake with the test additions, so that I can push the real fix related to the issue...

@stdweird
Copy link
Member

stdweird commented May 7, 2024

@jouvin sorry, i don't know where my head is last weeks ;)
can you already rebase the PR?

@stdweird
Copy link
Member

stdweird commented May 7, 2024

@jouvin quick look at the code, but $iface->{device} || $name is not the same as $iface->{device}. i guess you probably want to pass the $name by itself as 3rd argument and also use it to guess the vlanid

@jouvin
Copy link
Contributor Author

jouvin commented May 7, 2024

@stdweird don't worry, I'm totally overwhelmed too these days, tackling to many issues a the same time... My question was not really on the code itself (the line you mentioned is not from me and I was not completely convinced if was correct with Perl, look at a bashism...) but anyway my main problem at this stage is that I added one test in one of the .t file and this breaks all tests. It has nothing to do with the module code itself: just removing the added lines in the test is enough to make all the other tests successful. I guess I made a very stupid mistake in my test additions.

As for your request to rebase, I'd prefer not to do it at this stage (as I have the whole fix ready for this version and would prefer to rebase afterwards), except if you have particular reason for asking for it...

@stdweird
Copy link
Member

stdweird commented May 7, 2024

hmmm, that is not the reason. if you can rebase, i can run the tests locally and debug it quicker

@jouvin
Copy link
Contributor Author

jouvin commented May 7, 2024

Sorry I missed the conflicts... I'll do the rebase then...

jouvin added 2 commits May 7, 2024 17:10
- Was the value from the profile instead of the value after applying default
@jouvin jouvin force-pushed the nmstate_relax_vlan_name_syntax branch from 5b9837d to 8dbf700 Compare May 7, 2024 15:13
@jouvin
Copy link
Contributor Author

jouvin commented May 7, 2024

@stdweird rebase done

@stdweird
Copy link
Member

stdweird commented May 7, 2024

@jouvin hidden in the output is a

[ERROR] Filename /etc/nmstate/vlan.0.yml found that doesn't match the device regexp. Must be an error in ncm-network.

@stdweird
Copy link
Member

stdweird commented May 7, 2024

@jouvin the pan unittest is an actual example? what is the generated filename for that device? probably the generated yml filename also needs to follow the $device logic you modified.

@jouvin
Copy link
Contributor Author

jouvin commented May 7, 2024

@stdweird yes sure, I need to fix the test for the modified logic but wanted to fix the test logic first... I apologise missing the error, I thought it looked every line... My bad! I'll check in the next days and try to complete this PR.

@stdweird
Copy link
Member

stdweird commented May 8, 2024

if you want to allow such names, just replace the whole device regex with \w+; or implement logic to skip the test (maybe controlled by boolean from the schema. you can have it default to true (ie allow arbitrary names by default)

@jouvin
Copy link
Contributor Author

jouvin commented May 8, 2024

@stdweird yes it is what I have done in fact but I have not pushed it yet.. I wanted to add a test first demonstrated the problem with the vlan0 example which works because of a bug in the configuration module...

For my information, where have you seen this "hidden error"?

@stdweird
Copy link
Member

it's not hidden, it's just a single line of ERROR output; it's easily missed

@jouvin
Copy link
Contributor Author

jouvin commented May 13, 2024

@stdweird I think I understood why I was not seeing the error you mentioned. You need -Dprove.args to see the error message from the configuration module (else I don't see any error related to the missing file) and the error I get is:

[ERROR] Filename /etc/nmstate/vlan.0.yml found that doesn't match the device regexp. Must be an error in ncm-network.

It seems that the component doesn't like the device name vlan.0 because there is no number after vlan and before the .. If using vlan0.0, the problem disappears. The change made seems to cause a problem with the ib1.12345 test but it is something else and doesn't break all the other tests that receive undef instead of the expected yml file contents...

I'll check if there is a good reason for such a restriction...

@jouvin
Copy link
Contributor Author

jouvin commented May 13, 2024

And it is unexpected for me. This error comes from ncm-network and I thought it was not used at all when using ncm-nmstate. I may have missed something...

@stdweird
Copy link
Member

@jouvin
Copy link
Contributor Author

jouvin commented May 15, 2024

@stdweird I found it but is it expected that we execute some ncm-network code when using ncm-nmstate?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

ncm-nmstate: VLAN name pattern too restrictive
3 participants