Add content validation tests #1163

Open
ivanayov opened this issue May 11, 2022 · 4 comments
Labels
CI/CD (Continuous Integration and Continuous Delivery), feature (new feature)

Comments

@ivanayov
Contributor

Currently, the tests verify that Tern runs successfully, but they don't check the generated content.

This proposal suggests the following coverage:

  1. SPDX

    • SPDX json

      • Verify that each package contains "name", "SPDXID", "versionInfo", "downloadLocation", "filesAnalyzed", "licenseConcluded", "licenseDeclared", "copyrightText" and, if available, "checksums", and that each field holds the expected value
      • Verify that "NOASSERTION" and "NONE" values are properly set
      • Verify that if a LicenseRef is used, hasExtractedLicensingInfos contains the matching entry, and that it contains no such entry otherwise
      • Verify that the relationships tree is as expected
      • Verify that spdxVersion is properly generated
    • SPDX tag/value

      • Same checks as for SPDX json
  2. Verify CycloneDX format

  3. Verify that HTML, YAML, JSON, human readable formats are properly generated
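A rough sketch of what the SPDX json checks in item 1 could look like, assuming the document has already been loaded into a dict with `json.load`. The function name `check_spdx_json` and the decision to treat every listed field except "checksums" as mandatory are assumptions made for illustration, not existing Tern code. It only looks for LicenseRef identifiers on packages and leaves out the relationships tree.

```python
import re

# Fields from the list above; "checksums" is left out because it is "if available".
REQUIRED_PACKAGE_FIELDS = (
    "name", "SPDXID", "versionInfo", "downloadLocation", "filesAnalyzed",
    "licenseConcluded", "licenseDeclared", "copyrightText",
)
LICENSE_REF = re.compile(r"LicenseRef-[A-Za-z0-9.\-]+")


def check_spdx_json(doc):
    """Return a list of problems found in an SPDX JSON document (a dict)."""
    problems = []
    if not str(doc.get("spdxVersion", "")).startswith("SPDX-"):
        problems.append("spdxVersion is missing or malformed")

    # LicenseRef identifiers that have extracted licensing info.
    declared_refs = {
        info["licenseId"]
        for info in doc.get("hasExtractedLicensingInfos", [])
        if info.get("licenseId")
    }
    used_refs = set()
    for pkg in doc.get("packages", []):
        label = pkg.get("name", "<unnamed>")
        for field in REQUIRED_PACKAGE_FIELDS:
            if field not in pkg:
                problems.append(f"{label}: missing {field}")
        # Collect every LicenseRef mentioned in the package license fields.
        for field in ("licenseConcluded", "licenseDeclared"):
            used_refs.update(LICENSE_REF.findall(str(pkg.get(field, ""))))

    for ref in sorted(used_refs - declared_refs):
        problems.append(f"{ref} is used but has no extracted licensing info")
    for ref in sorted(declared_refs - used_refs):
        problems.append(f"{ref} has extracted licensing info but is never used")
    return problems
```

A test could then assert that `check_spdx_json(doc)` returns an empty list for the report generated from the test image.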

Am I missing some specific use-case?

@rnjudge added the "feature" and "CI/CD" labels on May 12, 2022
@rnjudge
Contributor

rnjudge commented May 13, 2022

Hi @ivanayov!

Thanks for the issue! This is something that we have struggled with in our CI testing. It's hard to set expected values for container image metadata in the CI because images are updated frequently. Even if we pin a container image by digest, we would have to keep our tests up to date as the pinned base images become outdated or unavailable.

If you want to look into this, though, we would welcome it and I think the functionality would be worthwhile :) Maybe start by taking a look at common images to see how long they're typically available and what type of metadata is available in them, to get a sense of the ongoing maintenance this would require? I don't think I've found an image yet where every piece of metadata would be available to verify, but there might be one with most of the metadata fields present.

My suggestion would also be to pick one or two packages to verify metadata for within the image. Doing it for all of the packages in an image would mean hard-coding a lot of values and would likely incur significant technical debt in the future.

As far as the particular fields you're looking at verifying:

> Verify that HTML, YAML, JSON, human readable formats are properly generated

I don't think we need to individually verify all of the basic report formats. All of the report formats use the same metadata from Tern's data model, so as long as we choose one there's not a lot of added value in each additional format we verify. Maybe we start with JSON, since more information is available in that report than in the default one and it would be easier to parse. We would probably want to start by verifying a few package names, versions, licenses and perhaps the package format? We could always add more verification later, but the more fields we validate, the more places we need to manually track and update a value if it changes in the container image we use for testing.
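As a sketch of the "one or two packages" idea, the check below compares a small hard-coded table against a parsed JSON report. The report layout (`images` → `image` → `layers` → `packages`), the keys `version` and `pkg_license`, and the helper names are assumptions about the JSON report and should be checked against real output; the values in `EXPECTED` are placeholders, not metadata from an actual image.

```python
# Placeholder expectations for one pinned test image (not real image metadata).
EXPECTED = {
    "bash": {"version": "5.0-4", "pkg_license": "GPL-3.0"},
}


def iter_packages(report):
    """Yield every package dict in the report, assuming the layout described above."""
    for image in report.get("images", []):
        for layer in image.get("image", {}).get("layers", []):
            yield from layer.get("packages", [])


def check_expected_packages(report, expected):
    """Return a list of mismatches between the report and the expected values."""
    found = {pkg.get("name"): pkg for pkg in iter_packages(report)}
    mismatches = []
    for name, fields in expected.items():
        pkg = found.get(name)
        if pkg is None:
            mismatches.append(f"{name}: not found in report")
            continue
        # Only the listed fields are compared, so the table stays small.
        for key, want in fields.items():
            got = pkg.get(key)
            if got != want:
                mismatches.append(f"{name}.{key}: expected {want!r}, got {got!r}")
    return mismatches
```

Keeping the expectations in a single table means that a change of test image only touches `EXPECTED`, which limits the manual tracking mentioned above.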

As far as verifying the SPDX format, I think we could rely on the verification of the JSON report values for this, at least to start. We have a test in the CI that validates that the SPDX document created is valid and that seems sufficient given that the metadata in the SPDX document would be the same info as a JSON report.

> Verify CycloneDX format

This is probably worthwhile for some fields unique to CDX reports (like purls), but re-checking the same package name/version fields is not necessary.
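A minimal sketch of a purl check on a parsed CycloneDX JSON document, assuming the usual top-level `components` list with `name`, `version` and `purl` keys. The function name `check_purls` is hypothetical, and the parsing is a simplification of the purl format (`pkg:type/namespace/name@version?qualifiers#subpath`), not a full implementation of the purl specification.

```python
from urllib.parse import unquote


def check_purls(bom):
    """Return a list of problems with the purl of each component in a CycloneDX dict."""
    problems = []
    for comp in bom.get("components", []):
        name = comp.get("name", "<unnamed>")
        purl = comp.get("purl")
        if not purl:
            problems.append(f"{name}: no purl")
            continue
        if not purl.startswith("pkg:"):
            problems.append(f"{name}: purl does not start with 'pkg:'")
            continue
        # Drop the scheme, then the subpath and qualifiers.
        body = purl[len("pkg:"):].split("#", 1)[0].split("?", 1)[0]
        if "@" in body:
            path, _, version = body.rpartition("@")
        else:
            path, version = body, ""
        purl_name = unquote(path.rsplit("/", 1)[-1])
        # Some purl types lowercase the name, so compare case-insensitively.
        if purl_name.lower() != name.lower():
            problems.append(f"{name}: purl name is {purl_name!r}")
        if comp.get("version") and unquote(version) != comp["version"]:
            problems.append(f"{name}: purl version is {unquote(version)!r}")
    return problems
```

This only checks that the purl is consistent with the component it is attached to, so it does not repeat the name and version assertions already made against the JSON report.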

What do you think?

Also tagging @nishakm on this.

@ivanayov
Contributor Author

Related to #933

@ivanayov
Contributor Author

Related to #1060

@ivanayov
Contributor Author

Given that we already have SPDX document validation, I agree on focusing on JSON only, building on the SPDX validation, plus adding tests for the CDX-specific data.

This also means that the tests should be updated after every SPDX release.
I'd suggest following a straightforward approach for now and, once we have the temporary support for both 2.2 and 2.3, seeing how validation can be approached to minimise version updates.
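One possible way to limit the per-release churn, sketched under the assumption that the tests read `spdxVersion` from a parsed SPDX JSON document: keep the accepted versions in a single constant. The names `SUPPORTED_SPDX_VERSIONS` and `spdx_version_supported` are hypothetical.

```python
# The only place to edit when a new SPDX release is supported.
SUPPORTED_SPDX_VERSIONS = frozenset({"SPDX-2.2", "SPDX-2.3"})


def spdx_version_supported(doc, supported=SUPPORTED_SPDX_VERSIONS):
    """Return True if the document's spdxVersion is one of the supported versions."""
    return doc.get("spdxVersion") in supported
```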
