New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updates to Node distribution test (#489) #558
base: main
Are you sure you want to change the base?
Conversation
Also, like its written in the issue itself, we could think about providing some more scenarios for this test, especially with the newly added testing possibilities via files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea with the test data. I have some suggestions about this to make it easier to unit test, see the inline comments.
Apart from that: I think it makes more sense to remove external logging configuration and the --config
flag. It is not used anywhere and the short form -c
also conflicts with the short form of --os-cloud
in the IaaS scripts.
The logging config can be simplified like it is done here:
standards/Tests/iaas/flavor-naming/cli.py
Lines 128 to 130 in 5e9dfd5
if __name__ == '__main__': | |
logging.basicConfig(format='%(levelname)s: %(message)s', level=logging.INFO) | |
cli(obj=Config()) |
Also, when I run without any arguments, the script fails.
$ ./kaas/k8s-node-distribution/k8s-node-distribution-check.py
The config file under ./config.yaml couldn't be found.
Traceback (most recent call last):
File "/home/martinmorgenstern/workspace/scs-standards/Tests/kaas/k8s-node-distribution/k8s-node-distribution-check.py", line 127, in initialize_config
with open(config.config_path, "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: './config.yaml'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/martinmorgenstern/workspace/scs-standards/Tests/kaas/k8s-node-distribution/k8s-node-distribution-check.py", line 131, in initialize_config
exit(1)
File "/usr/lib/python3.10/_sitebuiltins.py", line 26, in __call__
raise SystemExit(code)
SystemExit: 1
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/martinmorgenstern/workspace/scs-standards/Tests/kaas/k8s-node-distribution/k8s-node-distribution-check.py", line 244, in <module>
return_code = asyncio.run(main(sys.argv[1:]))
File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/usr/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "/home/martinmorgenstern/workspace/scs-standards/Tests/kaas/k8s-node-distribution/k8s-node-distribution-check.py", line 197, in main
config = initialize_config(parse_arguments(argv))
File "/home/martinmorgenstern/workspace/scs-standards/Tests/kaas/k8s-node-distribution/k8s-node-distribution-check.py", line 135, in initialize_config
setup_logging(config.logging)
File "/home/martinmorgenstern/workspace/scs-standards/Tests/kaas/k8s-node-distribution/k8s-node-distribution-check.py", line 112, in setup_logging
logging.config.dictConfig(config_log)
File "/usr/lib/python3.10/logging/config.py", line 811, in dictConfig
dictConfigClass(config).configure()
File "/usr/lib/python3.10/logging/config.py", line 375, in __init__
self.config = ConvertingDict(config)
TypeError: 'NoneType' object is not iterable
Tests/kaas/k8s-node-distribution/k8s-node-distribution-check.py
Outdated
Show resolved
Hide resolved
Tests/kaas/k8s-node-distribution/k8s-node-distribution-check.py
Outdated
Show resolved
Hide resolved
Tests/kaas/k8s-node-distribution/k8s-node-distribution-check.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The standard states that
At least one control plane instance MUST be run in each "failure zone"
I played around with the test-example.yaml
you provided in ipython
and created a scenario where all control plane nodes are in the same zone and region. In this case, the script would still exit with 0 and just print
WARNING: There seems to be no distribution across multiple regions or labels aren't set correctly across nodes.
WARNING: There seems to be no distribution across multiple zones or labels aren't set correctly across nodes.
INFO: The master nodes are distributed across 3 host-ids.
INFO: The worker nodes are distributed across 2 regions.
Shouldn't it produce an ERROR
? Or do we assume the physical host is the failure zone?
Furthermore, I would like to see more scenarios like the one in the yaml file, and have them tested automatically in the CI against the expected check outcome, with pytest
.
The config yaml template file can be removed.
The thing that was a bit problematic while writing the standard was that different CSPs probably have different ideas about their failure zone, especially having small CSPs in mind. So it could very well be possible that the physical hosts are dividing the cluster into failure zones. Personally I think thats perfectly reasonable, but if you have another opinion on that, we can discuss this further. (But that would probably also mean to change the standard again)
I'm gonna add the other files I added with some additional examples.
|
Thanks for the clarification, I agree 👍 |
Added some more example files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Second review round:
As we discussed today in person, I've committed a pytest script that uses the scenarios in your YAML files and I added a few more (in my opinion) interesting scenarios.
One test currently fails (intentionally) because the conformance test issues a warning twice if the host-id
label is missing where I think it should only do that once (that's one of the new scenarios I added). What do you think?
Furthermore, I propose to replace all occurences of master
with control-plane
(or just control
), because this fits better with the current terminology in Kubernetes (master
is outdated).
Small typo, which is not part of the diff: in line 149, it should read "the label … doesn't …" instead of "… don't …".
Tests/kaas/k8s-node-distribution/k8s-node-distribution-check.py
Outdated
Show resolved
Hide resolved
Tests/kaas/k8s-node-distribution/k8s-node-distribution-check.py
Outdated
Show resolved
Hide resolved
Agree with that, that duplication shouldn't happen. I removed that in the coming commit. I also did a general update to test test script in order to enforce the standard requirement for labels to be provided (MUST). And I made a small change to your test to reflect this. |
Tests/kaas/k8s-node-distribution/k8s-node-distribution-check.py
Outdated
Show resolved
Hide resolved
- added a new input that enables testing the test by providing yaml files containing label information for different nodes - added "test-example.yaml" containing an example for such a test file - removed the internal config, since the normal config file is already provided and read in as a default Signed-off-by: Hannes Baum <hannes.baum@cloudandheat.com>
Some dates fixing some problems mentioned by @martinmo. Signed-off-by: Hannes Baum <hannes.baum@cloudandheat.com>
Adds additional test files and removes previously deprecated config (template) files. Signed-off-by: Hannes Baum <hannes.baum@cloudandheat.com>
Signed-off-by: Martin Morgenstern <martin.morgenstern@cloudandheat.com>
Signed-off-by: Martin Morgenstern <martin.morgenstern@cloudandheat.com>
Signed-off-by: Martin Morgenstern <martin.morgenstern@cloudandheat.com>
Some fixes and updates in order to be compliant with the testdata. Thanks to @martinmo. Signed-off-by: Hannes Baum <hannes.baum@cloudandheat.com>
Signed-off-by: Martin Morgenstern <martin.morgenstern@cloudandheat.com>
a2a432a
to
976a078
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one minor PEP-8 thing.
# infrastructure parts to smaller ones. So we first look if nodes are distributed | ||
# across regions, then zones and then hosts. If one of these requirements is fulfilled, | ||
# we don't need to check anymore, since a distribution was already detected. | ||
labels = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
labels = ( | |
LABELS = ( |
this is the way PEP-8 recommends it, I think
(of course, the rest of the script would have to be adjusted accordingly)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed it to LABELS
class LabelException(BaseException): | ||
"""Exception raised if a label isn't set""" | ||
|
||
|
||
class Config: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class now seems a bit superfluous.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(But it can stay in case the options grow again)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm gonna keep that for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry, I would like to request two changes:
- rename the script under test so that it can be imported directly
- put the test data into a single yaml file, load that once, and hand out the sections using a fixture
Small pep-8 changes. Signed-off-by: Hannes Baum <hannes.baum@cloudandheat.com>
@mbuechse had some change requests, that are tackled with this commit. Signed-off-by: Hannes Baum <hannes.baum@cloudandheat.com>
c60f895 should have the requested changes. |
@cah-hbaum Thanks! Sorry for the long delay on my end. I think you need to update https://github.com/SovereignCloudStack/standards/blob/main/Tests/scs-compatible-kaas.yaml as well, no? Apart from that, I would prefer if the yaml file were loaded only once, which should be possible with pytest, if you use fixtures, or if you just do it on the module level. But it's not a deal breaker. |
Signed-off-by: Hannes Baum <hannes.baum@cloudandheat.com>
I updated the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM ;)
check_nodes
function (thanks to @martinmo)