Integrating test automation framework #774
base: master
Conversation
We added Dockerfiles to create containers for the daemons where the entrypoint process is not run by root. These are only used for local runs of the test framework and not by the CI, so the containers built there are unchanged. The purpose is simply to allow running the tests manually as a non-root user on a Linux host.
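Such a non-root daemon image could be sketched roughly as follows (base image, user name and paths are illustrative, not the actual ones used in this PR):

```dockerfile
# Hypothetical sketch of a non-root variant of a daemon image.
# The only delta vs. the root image is the extra user and the USER directive.
FROM debian:stable-slim

# Install/copy the daemon binary here (omitted for brevity)

# Create an unprivileged user and drop privileges, so files written to
# mounted volumes (e.g. test results) are not root-owned on the host.
RUN useradd --create-home --shell /usr/sbin/nologin pmacct
USER pmacct

ENTRYPOINT ["/usr/local/sbin/nfacctd"]
```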
Thank you @nicvanorton and Leonardo (@rodonile) for this very useful and much-wanted contribution. I started to have a look myself, but wanted to ping Marc (@msune) as well, as he is the original developer of the CI in the pmacct repo. Paolo
Hey @nicvanorton @rodonile, sorry for the late response... busy days!

First and foremost, this looks like a great step forward for pmacct's E2E testing. Thank you and congrats 🥳! It's quite in line with the E2E tests I had in mind, and complementary to some of the PRs I will issue in the upcoming week in that direction.

This PR is quite big, so it will take me some time to review it. It would have been a good idea to start with a smaller PR, in which the base test framework, strategy and a single smoke test were contributed, so that we could have a more focused discussion. (If at all possible, I would highly encourage rescoping this PR in that direction; it will hugely simplify my review work, and your rebase work if changes are required.)

There are a couple of things we can start to discuss right off the bat:

1. The need for root containers

I would like to avoid having a copy of the Dockerfiles. You mentioned the reason in #774 (review). Is there a chance you could add documentation for that?

2. Tracking binary files

Tracking binary files in git comes at a cost. I would suggest looking into two possible alternatives:

3.
Hi @msune, thanks for your message and the positive comments!

The PR is big indeed, as it was developed over more than a year. Much of its size, though, is due to the many test cases that have been built around the framework and included therein, which may not need a thorough review (they check pmacct functionality itself and have been tested for months). I'd say the main need for review lies in the framework approach itself; this is why I think the effort of going through a new PR would not pay off (since the framework itself would still need to be included).

1. (Non-)root containers
2. Binary files
3. Output files

As for running the framework locally, there's documentation about that in the README file. Let me know if something in there is not clear, or if something does not run as expected/documented. Best,
I would say, let's remove this part from this PR, and open another for it.
Yes, some repositories do that, at a cost (I'm not sold on it being common practice; take a look at GitHub's hard limits for large files). For tcpdump, for instance:
Tests essentially occupy 30% of the unpacked repository, and likely ~40-50% of the git db (binaries can't be compressed that easily).
That is actually my plan for another E2E suite. I don't want CI/CD to be downloading a lot of stuff it won't use. A submodule is cheap and guarantees consistency at the commit level. The repo could be under pmacct's org.
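The submodule approach mentioned above could look along these lines; the sketch below is self-contained (it creates a throwaway local repo as a stand-in for a hypothetical external test-data repository, since no actual repo name or URL has been decided here):

```shell
# Sketch: pinning test data (pcaps, expected outputs) as a git submodule,
# so CI/CD only fetches it when needed and consistency is guaranteed at
# the commit level. Repo names/paths are hypothetical.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Stand-in for the external test-data repository
git init -q data-repo
git -C data-repo -c user.email=ci@example.org -c user.name=ci \
    commit -q --allow-empty -m "test data"

# Main repo consuming it as a submodule pinned to a commit
git init -q pmacct-like
cd pmacct-like
git -c protocol.file.allow=always submodule add -q "$tmp/data-repo" tests/data
git submodule status tests/data
```

A consumer would then run `git submodule update --init` after cloning to fetch the pinned test data.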
I am not sure I follow. I've worked on systems very similar to this to validate dataplanes (using pytest too, SW and HW packet generators, etc.), in which indeed ordering of outputs was not guaranteed all the time. You typically add checks in order when possible, and then accumulate (execution) outputs and sort for sections where interlacing could happen, but still compare to manually created synthetic outputs.

Doing "fuzzy pattern matching" on outputs as described seems a slippery slope to me. First, pattern matching in itself can have bugs. Second, if you take outputs from the same device under test (pmacct) that you are attempting to validate (let's say, from what you think is a valid run) - which I think you do - you run a high risk of compensating bugs with your own "supposedly valid" observations, as both the expected output and the actual output come from executions of the DUT (pmacct). This can particularly easily go under the radar when test cases are modified or new test cases are added (and hence outputs), because it requires a very careful human validation of the entire output to match.
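The accumulate-and-sort technique described above could be sketched like this (field names are illustrative, not pmacct's actual output schema):

```python
# Sketch: validating interlaced output against a hand-written synthetic
# expected set. Within a section where ordering is not guaranteed,
# accumulate the received JSON lines, sort both sides canonically, and
# compare - without any fuzzy pattern matching.
import json

def canonical(lines):
    """Parse JSON lines and sort them deterministically."""
    return sorted((json.loads(l) for l in lines),
                  key=lambda d: json.dumps(d, sort_keys=True))

# Synthetic expected output, written by hand by the test author
expected = [
    '{"peer_ip": "10.0.0.1", "event": "peer_up"}',
    '{"peer_ip": "10.0.0.2", "event": "peer_up"}',
]

# Output captured from the DUT; order may differ between runs
received = [
    '{"event": "peer_up", "peer_ip": "10.0.0.2"}',
    '{"event": "peer_up", "peer_ip": "10.0.0.1"}',
]

assert canonical(received) == canonical(expected)
```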
Concerning the repo and submodule, I'm easy. No strong preference, so I'm leaving this to you, Paolo and Leonardo (Leonardo will be back from holiday in a couple of weeks). As for the output files, our approach is not very different from the one you described. We practically split the test into sections, each of which has its own corresponding output file. Assertions are done for each section in the designated order. In any case, it is up to the test author to use more detailed assertions if desired (...and even not use output files at all). It's an interesting point that using the DUT's output creates a cyclic dependency. IMHO this risk is minimal, because the output files are supposed to be manually evaluated by the test author. And it is superseded by the fact that hard-coded assertions cannot have exhaustive coverage. Finally, it also boils down to the question of whether one is primarily interested in detecting changes in the DUT's behaviour (regression), or in checking against a list of well-defined standards or requirements (functionality).
Why do you think so? The fact that the output is synthetically generated (and not the result of an execution of the DUT) doesn't preclude having thousands of assert conditions. I would counter-argue that you can have a more exhaustive, and especially more controlled, coverage. When lots of things need to be tested, that's where

Can you point me to the worst-case test that you think can't be checked against pytest asserts? I am fairly sure we can do that.
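The table-driven style alluded to here (in the spirit of pytest's parametrization, shown as a plain loop so the snippet is self-contained; the fields and the trivial decode stand-in are illustrative, not pmacct code) could look like:

```python
# Sketch: exhaustive, controlled coverage via a parameter table of
# synthetic expectations, checked with exact asserts.
PROTO_NAMES = {6: "tcp", 17: "udp"}

def decode(pkt):
    # Hypothetical stand-in for the DUT's decoding/aggregation step
    return {"ip_src": pkt["src"], "ip_proto": PROTO_NAMES[pkt["proto"]]}

cases = [
    # (input packet description, expected decoded fields)
    ({"src": "10.0.0.1", "proto": 6},  {"ip_src": "10.0.0.1", "ip_proto": "tcp"}),
    ({"src": "10.0.0.2", "proto": 17}, {"ip_src": "10.0.0.2", "ip_proto": "udp"}),
]

for pkt, want in cases:
    got = decode(pkt)
    for field, value in want.items():
        assert got[field] == value, f"{field}: {got[field]!r} != {value!r}"
```

With `@pytest.mark.parametrize` each row becomes its own reported test case, which keeps failures localized.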
I am not sure I follow. They seem to me complementary (and overlapping) dimensions of any testing.
Hello Marc, thanks for taking the time to review! Here is my take as well on some of the points addressed:
Given this, to address your concern we could move the non-root Dockerfiles back within the test-framework folder. I know that this is not really a solution, but in the end we would only use them for local runs of the tests anyway. Or we could even extend the container builder script to enhance the root Dockerfiles with the 2 additional lines, which removes the need to maintain the Dockerfiles in 2 different locations.

I figured it could be interesting for somebody who prefers to run rootless containers to have the Dockerfile available in the repo as well, just not as the default one used by the CI. For local runs of the tests we really need this, as otherwise it becomes a pain: manual clearing of the results folder would be required, since the pmacct-generated files are root-owned.
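The builder-script idea could be sketched as follows (file names and the two appended lines are illustrative; the sketch generates a throwaway root Dockerfile so it is self-contained):

```shell
# Sketch: derive the non-root Dockerfile from the root one at build time,
# so only a single copy of each Dockerfile is maintained.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Stand-in for the existing root Dockerfile
cat > Dockerfile <<'EOF'
FROM debian:stable-slim
ENTRYPOINT ["/usr/local/sbin/nfacctd"]
EOF

# Append the two extra lines to produce the non-root variant
{ cat Dockerfile; printf 'RUN useradd -m pmacct\nUSER pmacct\n'; } > Dockerfile.nonroot

cat Dockerfile.nonroot
```

The builder script would then pass `Dockerfile.nonroot` to `docker build -f` for local (rootless) runs, while CI keeps using the unchanged root Dockerfile.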
About this point, we had a meeting with Paolo where we aligned on the integration procedure and decided to include all the code in the master branch. Leaving it in a separate repo was an option. Unfortunately you were not involved in the discussions until now, and we have already spent time working towards integrating everything as you see here.

The main reason we decided in the end to include all the tests upstream is that this allows submitting a single PR for a new feature or bug fix, with code + test changes/new tests. Imagine that we add a new field to the default bmp peer_up schema: within the PR we would also modify the output files by adding this new field where needed. With the tests in another repo, this becomes more difficult to maintain, as you first need a PR with the new tests in the test repo and then a PR in pmacct with the code + updated submodule ref commit.

My personal opinion is that, overall, output files and pcap files are not that big; we don't even go near the limits stated by GitHub for large files, even for tests with lots of messages. And in general it doesn't make sense to use oversized pcap files for test cases. Our biggest, for example, is around 260KB. As an idea, we could state a size limit on the pcap files that we accept to be merged.

Anyway, I'm not strongly against the submodule approach. The main problem from my side is that this reorg would take more time than expected, which I don't really have in the following weeks (I don't know about Nicolas). I'm open to setting up a call to discuss this further. And also, as I mentioned above, there is the added burden that this is harder to maintain.
Anyway, when it comes to output generation, we are basically free to do it as we like. It is simpler to just take the output from pmacct, inspect it, declare it "ok" and go with it. Synthetically generating such an output would be nicer, I agree, but this means reproducing pmacct's logic to generate it (quite the effort, I would say). Consider that we are aggregating flow records and correlating with BGP RIB information in some cases. Have a look at e.g. test 400.

If your concerns are mostly related to the fuzzy pattern matching and substitution (e.g. of IP addresses) that is sometimes performed, note that this is not mandatory and could be avoided entirely by providing output files with the IP addresses used by the framework already filled in. We could also remove all fields that we ignore from the expected output files (as they are anyway removed from the json messages we receive from pmacct before they are compared). This way we avoid tracking e.g. timestamps that change on every run; that might be a good idea. I'm not sure if I'm correctly addressing your concerns here, though. Let me know what you think :) Have a good night! Cheers,
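Dropping ignored fields before comparison, as proposed here, could look roughly like this (the set of ignored fields and the message fields are illustrative):

```python
# Sketch: normalize JSON messages by dropping fields that change on
# every run (timestamps, sequence numbers, ...) before comparing them
# against the expected output file - so the expected files no longer
# need to track volatile values at all.
import json

IGNORED_FIELDS = {"timestamp", "timestamp_arrival", "seq"}  # illustrative

def normalize(msg: dict) -> dict:
    """Return the message without the fields excluded from comparison."""
    return {k: v for k, v in msg.items() if k not in IGNORED_FIELDS}

received = json.loads(
    '{"peer_ip": "10.0.0.1", "timestamp": "2024-01-01 00:00:01", "seq": 42}'
)
expected = json.loads('{"peer_ip": "10.0.0.1"}')  # volatile fields not tracked

assert normalize(received) == normalize(expected)
```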
Thanks. The only piece of information that you, @rodonile, added to the conversation is that Paolo was aware of that. That's ok. Just to be clear, my concerns, in order of priority, are 3, 1 and 2. About 3)
The offer is still on the table. I've worked with systems that had to validate dataplane traffic which had a lot of spurious and interlaced packets, and needed to validate a lot of parallel and interlaced Control Plane flows using pytest and scapy. I am talking hours of tests, thousands of assertions (coded using pytest), and that was never a problem. Quite the opposite, knowing what you expect as a test designer and checking for exactly that helped to find a lot of bugs - including bugs in the test. Using as input a (very complex) output of your own DUT is a bad idea. About 1)
I am confused... I don't want to block this PR. I also don't have time for this. So I would kindly ask you to remove the non-root containers from the PR, and I leave the decision to merge the PR entirely to Paolo, who in the end is the owner and the maintainer of pmacct. I think having them is better than nothing, yet this seems the best moment to have this discussion and set some good foundations.
I tried to provide more insight so that you could have the full picture as to why we made those choices.
As I wrote above, you could consider test 400 as an example.
I'm sure this can be done the way you are suggesting. It just seems more difficult and time-consuming to me, but tbh, not having much testing/pytest experience, I might absolutely be wrong. Also, this would mean changing most of our testing logic, so it would only be feasible in the long term, not the short term.

I can tell you that from our point of view at Swisscom, we have been using this test framework for months now for regression testing, and it's been quite stable, helping to identify several introduced bugs and, most importantly, giving us more peace of mind for the dev/prod deployments (before, it was really "deploy and hope for the best"). And of course, we can keep improving/extending it over time and gradually move to an approach with synthetic output generation. What do you think about this? From what I understood from your comments, you would prefer the test logic to be fully redesigned before merging, and I was skeptical about this mainly for time reasons. Would it be too problematic to move in that direction afterwards?
Ok, fine for me, I will remove those. We only have tests for nfacctd, pmbmpd and pmbgpd for now, and those can run unprivileged. It's just that in our initial requirements, when we were discussing the framework architecture, we had rootless execution as a requirement. But it's fine; all those interested in running the tests locally most likely have sudo permissions on their system.
I agree, and I'm open to discuss. Should we organize a meeting?
Changelog:
- Copied in the two split folders, edited required framework files and .gitignore
- Modified build_docker_images.sh to find pmacct in the root project folder; ran 200:00, 300:00, 400:00 (passed)
- Added -c argument to pass the pytest.ini file (otherwise it's not found and markers give warnings)
- Minor packet-timing-related updates to tests 302 & 304
- Minor packet-timing-related updates to tests 202 & 207
…hange every time and are not checked anyway
Short description
Integrates a pytest-based test automation framework into pmacct (finally... :-))
Added folders:
For more information about the framework, please refer to the README file in the test-framework folder.
Checklist
I have:
@paololucente please review and let us know if you agree with the integration approach