
Make test log parsing robust #3187

Open · wants to merge 2 commits into base: master

Conversation

heplesser (Contributor)

This PR fixes #3155. It catches errors in parsing the XML files generated by tests, displays them in the results table, and exits with a non-zero exit code as if tests had actually failed.

@heplesser heplesser added the labels T: Bug (Wrong statements in the code or documentation), S: Critical (Needs to be addressed immediately), and I: No breaking change (Previously written code will work as before, no one should note anything changing aside from the fix) on Apr 22, 2024
@heplesser heplesser added this to To do in Build system and CI via automation Apr 22, 2024
@heplesser (Contributor, Author)

@terhorstd @lekshmideepu Ping!

@terhorstd terhorstd (Contributor) left a comment

I'd suggest some small refactoring to reduce cross-code dependencies and the code complexity introduced by "another case".

Comment on lines 79 to 82
except Exception as err:
msg = f"ERROR: {pfile} not parsable with error {err}"
results[ph_name] = {"Tests": 0, "Skipped": 0, "Failures": 0, "Errors": 0, "Time": 0, "Failed tests": [msg]}
totals["Failed tests"].append(msg)
Contributor

Why is this exception not handled inside parse_result_file()? If the exception can be handled as is done here, it should be handled where it is first encountered; otherwise every caller of parse_result_file needs to implement it. Seems the results can be handled quite symmetrically to the "normal" return values, which would be even more reason to move the try … except inside.
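A minimal sketch of what moving the handling inside could look like, assuming a parse_result_file() that returns the per-phase dict; the helper name _parse_xml_results is illustrative and not taken from the actual script:

def parse_result_file(pfile):
    """Parse one test-phase XML result file into summary counts.

    Parse errors are handled here, so every caller gets back the same
    dict shape instead of having to wrap the call in try/except.
    """
    try:
        # _parse_xml_results is a hypothetical helper doing the actual XML work
        return _parse_xml_results(pfile)
    except Exception as err:
        msg = f"ERROR: {pfile} not parsable with error {err}"
        return {"Tests": 0, "Skipped": 0, "Failures": 0, "Errors": 0,
                "Time": 0, "Failed tests": [msg]}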

Contributor Author

Done with next push.

totals[k] += v
except Exception as err:
msg = f"ERROR: {pfile} not parsable with error {err}"
results[ph_name] = {"Tests": 0, "Skipped": 0, "Failures": 0, "Errors": 0, "Time": 0, "Failed tests": [msg]}
Contributor

Suggested change
- results[ph_name] = {"Tests": 0, "Skipped": 0, "Failures": 0, "Errors": 0, "Time": 0, "Failed tests": [msg]}
+ results[ph_name] = {"Tests": 0, "Skipped": 0, "Failures": 0, "Errors": 1, "Time": 0, "Failed tests": [msg]}

If we also count the "Failed tests" entry in the "Errors" column, we could keep len(results["Failed tests"]) == results["Errors"] + results["Failures"], which seems like an easy-to-sustain symmetry and follows the principle of least surprise.

Contributor

My preference would be to have the failed and errored tests reported separately.

Contributor Author

When parsing of the XML file fails, we lose all information about an entire phase of the testing process. We do not even know how many tests were in this phase and could therefore be considered "Failed". Indeed, in all cases I have seen so far, the tests actually all passed; we just could not parse the results because the XML file was corrupted.

Therefore, we need to handle parse failures differently and cannot include them in the other statistics.

Contributor

The exact number of failures is not relevant, since the type of failure determines criticality and required actions. So I think it's OK to have a parsing failure recorded as "one failure". The same happens in other cases where we treat one wrong result differently from any number of other errors. If there is at least one failure in a stage, we need to look into it and understand what it means anyway. It's not possible to dismiss a number of failures just because the number is small.

Contributor

@lekshmideepu, yes, that's why there are Errors and Failures. My point is that I think we do not need a third type that requires extra condition checks when interpreting the results. For me it would be intuitively clean to assert correct functioning of the testsuite with

assert results["Failures"] + results["Errors"] == len(results["Failed tests"])

where the list of failed tests contains a message for each of the errors and failures. The level on which something went wrong is not known for each failure anyway. If a test fails, the problem can lie anywhere between external libraries, NEST code, or a broken test or test suite. This is not special to a parsing error.

Contributor

@terhorstd Thanks for the clarification. I had understood it differently. I agree with your point.

Contributor Author

@terhorstd In pytest, "error" and "failure" have specific meanings (see https://stackoverflow.com/questions/21513718/pytest-error-vs-fail): if an assertion fails while a test_...() is executing, this counts as a failure, while a failed assertion outside a test_, especially in fixtures, is counted as an error. The sum of passes, failures, errors, and skips should always equal the total number of tests. This should apply to each row in our table and to the totals at the bottom.

If we cannot parse an XML results file, we have no information, and just counting that as an error would break the logic of the pytest result counts. Therefore, I still think that it is most sensible to treat XML parsing errors separately from pytest result tabulation.
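An illustrative standalone pytest file (not part of this PR) showing the distinction: the failed assertion inside the test body is reported as a failure, while the exception raised during fixture setup turns the dependent test into an error.

import pytest

@pytest.fixture
def broken_fixture():
    # Exception during setup: any test requesting this fixture is reported as an ERROR
    raise RuntimeError("setup went wrong")

def test_assertion_fails():
    # Failed assertion inside the test body: reported as a FAILURE
    assert 1 + 1 == 3

def test_with_broken_fixture(broken_fixture):
    # Test body is never executed; pytest counts this as an ERROR, not a failure
    assert True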

Comment on lines 105 to 107
if pr["Tests"] == 0 and pr["Failed tests"]:
print(f"{'--- XML PARSING FAILURE ---':^{len(cols) * col_w}}")
else:
Contributor

Not necessary if marked as in the previous comment. The extra XML PARSING FAILURE string can be absorbed into msg, and I don't see why centring is necessary in addition to all-caps.

Contributor Author

I have removed the all-caps. As explained above, we need to treat this case specially. And I find the centred message much more visually pleasing :).
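For reference, the centring in the quoted hunk relies on Python's format-spec alignment; here is a small standalone illustration in which the column names and width are made up to mimic the script's table variables:

# cols and col_w stand in for the table layout used by the summary script
cols = ["Tests", "Skipped", "Failures", "Errors", "Time", "Failed tests"]
col_w = 12

# ^ centres the message within the full table width (len(cols) * col_w characters)
print(f"{'--- XML parsing failure ---':^{len(cols) * col_w}}")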

testsuite/summarize_tests.py (resolved conversation)
lekshmideepu — this comment was marked as duplicate.

@lekshmideepu lekshmideepu (Contributor) left a comment

There is a typo in the link for posting issues (LN: 130): it should be nest-simulator instead of nest_simulator in the line below.
print(" https://github.com/nest/nest_simulator/issues")

@heplesser (Contributor, Author)

There is a typo in the link for posting issues (LN: 130): it should be nest-simulator instead of nest_simulator in the line below. print(" https://github.com/nest/nest_simulator/issues")

Thanks, fixed with next push.

@lekshmideepu lekshmideepu (Contributor) left a comment

Thanks @heplesser, looks good to me 👍

@gtrensch gtrensch moved this from To do to Review in Build system and CI May 17, 2024

Successfully merging this pull request may close these issues.

Failing MPI tests not properly detected
3 participants