License detection diffs are incorrect #191

Open
Zach-Johnson opened this issue Nov 16, 2023 · 2 comments

Comments

@Zach-Johnson

It looks to me like license diff detection is currently failing. For example, I added a package to a testing repository, and I see it reflected in the diff:

{
     "status": "added",
     "factors": [],
     "score": 100,
     "new": {
       "path": "project/vendor/github.com/AndreasBriese/bbloom/LICENSE",
       "type": "file",
       "name": "LICENSE",
       "size": 1671,
       "sha1": "73e9520e4dfbadc8e525d8f38dff93a62f8623fb",
       "fingerprint": "",
       "original_path": "project/vendor/github.com/AndreasBriese/bbloom/LICENSE",
       "licenses": [],
       "copyrights": []
     },
     "old": null
}

However, the licenses field is empty. I would expect it to contain a reference to the newly added license.
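To make the symptom easier to spot across a whole delta report, here is a minimal sketch that flags added files whose licenses list is empty. It assumes the deltas are available as a list of objects shaped like the entry above; the helper name added_files_without_licenses is hypothetical and not part of deltacode.

```python
import json

# The delta entry from the report above, trimmed to the keys used here.
DELTA_JSON = """
{
  "status": "added",
  "factors": [],
  "score": 100,
  "new": {
    "path": "project/vendor/github.com/AndreasBriese/bbloom/LICENSE",
    "type": "file",
    "name": "LICENSE",
    "licenses": [],
    "copyrights": []
  },
  "old": null
}
"""


def added_files_without_licenses(deltas):
    """Return the paths of added files whose 'licenses' list is empty.

    Hypothetical helper: assumes each delta is a dict with 'status' and
    'new' keys, as in the entry shown in this issue.
    """
    paths = []
    for delta in deltas:
        new = delta.get("new")
        if delta.get("status") != "added" or not new:
            continue
        if new.get("type") == "file" and not new.get("licenses"):
            paths.append(new["path"])
    return paths


deltas = [json.loads(DELTA_JSON)]
print(added_files_without_licenses(deltas))
```

Note that a file with no detectable license would also be flagged, so this only narrows down candidates; it does not prove a detection was dropped.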

I see that there was a large PR on scancode that probably modified the output structure: https://github.com/nexB/scancode-toolkit/pull/2961/files. I'm guessing that is what broke this. I'm happy to work on a fix if this project is still under development and will accept PRs.

@AyanSinhaMahapatra
Member

AyanSinhaMahapatra commented Nov 16, 2023

@Zach-Johnson thanks for reporting and offering to help!

We did indeed make a large upgrade to LicenseDetection in scancode-toolkit, with a lot of breaking changes, and then missed updating this part of deltacode, which caused this.
Each resource now has a license_detections field instead of the previous licenses field. It is a list of LicenseDetection objects, each of which now has an identifier that can be used to check whether the detections have changed: two identical detections have the same identifier, because the identifier contains a UUID created from the match contents. We also have a scan-level license detections list with these identifiers for all unique license detections.
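The identifier-based comparison described above could be sketched roughly as follows. This is only an illustration under stated assumptions: each resource is taken to be a dict with a license_detections list whose items carry an identifier key (names from the comment above), the helper functions are hypothetical rather than deltacode's actual code, and the identifier values are made up.

```python
def detection_identifiers(resource):
    """Collect the identifiers of a resource's license detections.

    Assumes each detection is a dict with an "identifier" key. A missing
    resource (None, e.g. "old" for an added file) yields an empty set.
    """
    if not resource:
        return set()
    detections = resource.get("license_detections") or []
    return {d["identifier"] for d in detections if "identifier" in d}


def diff_license_detections(old, new):
    """Return (added, removed) sets of detection identifiers."""
    old_ids = detection_identifiers(old)
    new_ids = detection_identifiers(new)
    return new_ids - old_ids, old_ids - new_ids


# Hypothetical resources; the identifier strings are invented placeholders.
old_resource = {
    "path": "LICENSE",
    "license_detections": [{"identifier": "mit-aaaa"}],
}
new_resource = {
    "path": "LICENSE",
    "license_detections": [
        {"identifier": "mit-aaaa"},
        {"identifier": "apache_2_0-bbbb"},
    ],
}

added, removed = diff_license_detections(old_resource, new_resource)
print(sorted(added), sorted(removed))
```

Because identical detections share an identifier, a plain set difference is enough to tell whether anything changed; an added file (old is None) simply reports all of its detections as added.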

See also https://scancode-toolkit.readthedocs.io/en/stable/reference/license-detection-reference.html for more info on why we added these changes and https://github.com/nexB/scancode-toolkit/blob/develop/CHANGELOG.rst#license-detection for the CHANGELOG on this.

I'm happy to work on a fix for this if this project is still under development and will accept PRs

That's great! Please ask if you need any help or have any questions. We will be very happy to help you update deltacode to work with the latest SCTK (scancode-toolkit)!

@Zach-Johnson
Author

@AyanSinhaMahapatra I've started a draft PR here: #192.
I'm not clear about a couple of things, but I'll move the discussion to that PR.
