Extract full license text #2724

Open
mmarseu opened this issue Mar 19, 2024 · 5 comments
Labels: enhancement (New feature or request), license (relating to software licensing)

Comments


mmarseu commented Mar 19, 2024

What would you like to be added:
SBOM formats such as CycloneDX and SPDX support including the full text of a license with a component. It would be great if syft could extract this information when scanning for licenses.

Why is this needed:
OSS license compliance is one important use case for SBOMs, especially in large enterprises. SBOMs produced by syft today include components with licenses identified by name (not SPDX ID) which is mostly useless without the accompanying text.

A comment on #2002 also asked for such a feature; however, I believe it was overlooked when the corresponding issue was closed.

Additional context:
Example component for curl, as produced by the dpkg cataloger in CycloneDX format (trimmed for conciseness):

{
    "type": "library",
    "name": "curl",
    "version": "7.81.0-1ubuntu1.15",
    "licenses": [
        // snip
        {
            "license": {
                "name": "other"
            }
        },
        {
            "license": {
                "name": "public-domain"
            }
        }
    ],
    "purl": "pkg:deb/ubuntu/curl@7.81.0-1ubuntu1.15?arch=amd64&distro=ubuntu-22.04",
    "properties": [
        // snip
        {
            "name": "syft:location:0:path",
            "value": "usr/share/doc/curl/copyright"
        },
        {
            "name": "syft:location:1:path",
            "value": "var/lib/dpkg/info/curl.md5sums"
        },
        {
            "name": "syft:location:2:path",
            "value": "var/lib/dpkg/status"
        }
    ],
    // snip
},
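For comparison, here is a rough sketch of what a license entry with embedded text could look like. It assumes the CycloneDX JSON license object, where `text` is an attachment with `contentType`, `encoding`, and `content` fields. The helper name `license_with_text` is hypothetical and is not part of syft; this only illustrates the desired output shape.

```python
import base64
import json


def license_with_text(name: str, text: str) -> dict:
    """Build a CycloneDX-style license entry that carries the full license text.

    Hypothetical helper for illustration; not syft's implementation.
    """
    # Base64 keeps arbitrary license text safe inside a JSON string.
    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return {
        "license": {
            "name": name,
            "text": {
                "contentType": "text/plain",
                "encoding": "base64",
                "content": encoded,
            },
        }
    }


if __name__ == "__main__":
    # Placeholder text; a real scan would read it from e.g. usr/share/doc/curl/copyright.
    entry = license_with_text("public-domain", "Placeholder license text.")
    print(json.dumps(entry, indent=4))
```

With an entry like this, a consumer of the SBOM could decode `content` and show the actual terms behind a name such as "other" or "public-domain".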
mmarseu added the enhancement label Mar 19, 2024

italvi commented Mar 20, 2024

Maybe a good source for licenses like public-domain could be the ScanCode LicenseDB, as unfortunately SPDX will not add such an ID to their list.

wagoodman added the license label Mar 21, 2024
Contributor

tgerla commented Mar 21, 2024

Hi @mmarseu, thanks for the suggestion. We think it makes sense to include full license text or license snippets where available, as an opt-in configuration. We've got some more design work to do but we'll put this issue in the backlog for implementation at some point. If you're interested in working on this, let us know and we can collaborate. Thanks!

@wagoodman
Contributor

dev note: we could start adding full license text when a file's name or contents are detected to be a license, or partial license text when it is found within a file. These could be persisted on the **file** object in the SBOM, not the **package** object.
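The filename half of that idea can be sketched roughly as follows. This is only an illustration under assumptions: the name pattern, `looks_like_license_file`, and `collect_license_texts` are hypothetical, and syft's real detection logic may differ. The result is keyed by file path, mirroring the suggestion to attach text to the file rather than the package.

```python
import re
from pathlib import PurePosixPath

# Assumed heuristic: common license-like base names, optionally followed by a suffix.
LICENSE_NAME_PATTERN = re.compile(
    r"^(licen[sc]e|copying|copyright|notice)([.\-_].*)?$", re.IGNORECASE
)


def looks_like_license_file(path: str) -> bool:
    """Return True if the file's base name resembles a license file."""
    return bool(LICENSE_NAME_PATTERN.match(PurePosixPath(path).name))


def collect_license_texts(files: dict[str, str]) -> dict[str, str]:
    """Map each license-like file path to its full text.

    `files` maps a path to the file's contents (already read by the caller).
    """
    return {
        path: text for path, text in files.items() if looks_like_license_file(path)
    }
```

Of the three locations in the curl example above, only `usr/share/doc/curl/copyright` would match this heuristic. Content-based detection would need a separate classifier and is not shown here.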

Author

mmarseu commented Mar 22, 2024

> Hi @mmarseu, thanks for the suggestion. We think it makes sense to include full license text or license snippets where available, as an opt-in configuration. We've got some more design work to do but we'll put this issue in the backlog for implementation at some point.

Thank you so much! Looking forward to a solution :)

> If you're interested in working on this, let us know and we can collaborate. Thanks!

Sadly, I wouldn't be able to write a "Hello, World" in Go if my life depended on it 😅


Joerki commented Mar 29, 2024

Please let me add that including copyright information is also a significant legal obligation when software vendors publish an attribution report for their work.
If this information is not provided in the package metadata, it should perhaps be extracted from the package contents and supplied in the SBOM component data. Does it make sense to consider this aspect in this issue as well?
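As a rough illustration of what such extraction might involve for Debian packages, the sketch below pulls `Copyright:` fields out of a copyright file. It assumes the file follows the machine-readable debian/copyright format, where continuation lines start with whitespace; many real copyright files are free-form and would need other handling. The function name is hypothetical and not part of syft.

```python
def extract_copyright_statements(copyright_text: str) -> list[str]:
    """Collect the values of all `Copyright:` fields, one entry per line.

    Assumes the machine-readable debian/copyright layout; not syft's implementation.
    """
    statements = []
    in_copyright_field = False
    for line in copyright_text.splitlines():
        if line.startswith("Copyright:"):
            in_copyright_field = True
            value = line[len("Copyright:"):].strip()
            if value:
                statements.append(value)
        elif in_copyright_field and line[:1] in (" ", "\t") and line.strip():
            # Indented, non-empty line: continuation of the current field.
            statements.append(line.strip())
        else:
            # Any other line (new field, blank line) ends the field.
            in_copyright_field = False
    return statements
```

The returned statements could then be attached to the component alongside its license entries, for example when generating an attribution report.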

Projects
Status: Backlog