
Check models compatibility in Docker rebuild GH Actions workflow #719

Draft: wants to merge 14 commits into main
Conversation

juhoinkinen (Member) commented Jul 6, 2023

The recently added GH Actions workflow for rebuilding Docker images (#715) could also verify that models trained on the previous image build produce identical results in the new image. It would be quite undesirable for models to behave even slightly differently in different Docker image builds of the same Annif version.

These are the steps in the workflow that aim to verify model compatibility and identical results (a rough sketch of the steps follows the list):

  1. Train models with all (trainable) algorithms using the old image (the one in quay.io with the tag being rebuilt)
  2. Evaluate the models using the old image and store the results in an eval.prev.out file
  3. Evaluate the models using the new image and store the results in an eval.out file
  4. Compare eval.prev.out and eval.out with diff, and fail the workflow if they differ, unless the box allowing differences was checked when triggering the workflow
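A minimal sketch of what these steps might look like in the workflow. The image tags, the example project ID (`tfidf-en`), the mount path and the `allow-differences` input are illustrative assumptions, not the actual workflow contents:

```yaml
# Hypothetical sketch of the compatibility-check steps; image tags,
# project ID and the allow-differences input are illustrative only.
- name: Train models using the old image
  run: |
    docker run --rm -v "$(pwd):/annif-projects" \
      quay.io/natlibfi/annif:old-tag \
      annif train tfidf-en tests/corpora/archaeology/fulltext/

- name: Evaluate models using the old image
  run: |
    docker run --rm -v "$(pwd):/annif-projects" \
      quay.io/natlibfi/annif:old-tag \
      annif eval tfidf-en tests/corpora/archaeology/fulltext/ > eval.prev.out

- name: Evaluate models using the new image
  run: |
    docker run --rm -v "$(pwd):/annif-projects" \
      annif:new-build \
      annif eval tfidf-en tests/corpora/archaeology/fulltext/ > eval.out

- name: Compare evaluation results
  run: |
    # Fail on any difference, unless the workflow_dispatch checkbox
    # allowing differences was checked when triggering the workflow.
    if ! diff eval.prev.out eval.out; then
      [ "${{ github.event.inputs.allow-differences }}" = "true" ] || exit 1
    fi
```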

For both training and evaluation the tests/corpora/archaeology/fulltext/ corpus is used, which I think is fine for all algorithms, although there could be dedicated corpora for this.

Also, there could be a similar workflow for checking model compatibility when preparing an Annif release, instead of doing the compatibility checks manually. The compatibility-check steps could then be moved to a separate action for reusability, like the prepare action of the CI/CD workflow; see the sketch below.
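Such a reusable step could look roughly like a composite action. Everything here, including the action name, its inputs and the evaluated project, is a hypothetical sketch rather than the actual implementation:

```yaml
# .github/actions/check-model-compatibility/action.yml
# Hypothetical composite action; name, inputs and paths are illustrative.
name: Check model compatibility
description: Evaluate pre-trained models with two images and compare results
inputs:
  old-image:
    description: Image whose evaluation results serve as the baseline
    required: true
  new-image:
    description: Freshly built image to verify against the baseline
    required: true
runs:
  using: composite
  steps:
    - name: Evaluate with both images and compare results
      shell: bash
      run: |
        docker run --rm -v "$(pwd):/annif-projects" "${{ inputs.old-image }}" \
          annif eval tfidf-en tests/corpora/archaeology/fulltext/ > eval.prev.out
        docker run --rm -v "$(pwd):/annif-projects" "${{ inputs.new-image }}" \
          annif eval tfidf-en tests/corpora/archaeology/fulltext/ > eval.out
        diff eval.prev.out eval.out
```

Both the rebuild workflow and a release workflow could then call it via `uses: ./.github/actions/check-model-compatibility`, passing the two image references as `with:` inputs.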

Note: I've been working on this in my own fork, to avoid accidental image pushes to quay.io.

TODO before merge:

  • Switch the image that the current build is compared against from jinkinen/annif to quay.io/natlibfi/annif

codecov bot commented Jul 6, 2023

Codecov Report

Patch and project coverage have no change.

Comparison is base (320af2b) 99.67% compared to head (07f0af7) 99.67%.

❗ Current head 07f0af7 differs from pull request most recent head 569b367. Consider uploading reports for the commit 569b367 to get more accurate results

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #719   +/-   ##
=======================================
  Coverage   99.67%   99.67%           
=======================================
  Files          89       89           
  Lines        6380     6380           
=======================================
  Hits         6359     6359           
  Misses         21       21           


sonarcloud bot commented Jul 6, 2023

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 0 (rating A)

No Coverage information
No Duplication information

juhoinkinen (Member, Author) commented

When #762 is merged, the upload/download functionality could be utilized for the model compatibility check. By downloading the models (maybe from the GitHub Actions cache?), this first step could be omitted:

  1. Train models with all (trainable) algorithms using the old image (the one in quay.io with the tag being rebuilt)
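A hypothetical sketch of how actions/cache could gate the training step so it only runs on a cache miss. The cache key, mount path, project ID and the `tag` input are assumptions for illustration:

```yaml
# Hypothetical: restore previously trained models from the Actions cache,
# so the training step can be skipped when a cached copy exists.
- name: Restore trained models
  id: model-cache
  uses: actions/cache@v3
  with:
    path: data/
    key: annif-models-${{ github.event.inputs.tag }}

- name: Train models using the old image
  if: steps.model-cache.outputs.cache-hit != 'true'
  run: |
    docker run --rm -v "$(pwd):/annif-projects" \
      quay.io/natlibfi/annif:${{ github.event.inputs.tag }} \
      annif train tfidf-en tests/corpora/archaeology/fulltext/
```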
