License not picked up for binaries like java (openjdk), node (nodejs) #2765

Open
mithunms333 opened this issue Apr 10, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@mithunms333

mithunms333 commented Apr 10, 2024

What happened:
I ran a syft scan on a container image that contains Java binaries extracted from an OpenJDK tar archive (downloaded from GitHub, Adoptium) rather than installed as RPM packages. The SBOM JSON (SPDX, CycloneDX) lists the binary component with the name 'java' and its correct version and location, but its license is not picked up. There is a LICENSE file at the path: '.../openjdk/legal/java.base/LICENSE'.
I believe the issue is the same for all binaries of any type, whether Java or Node.js, and from all GitHub projects/vendors.

Additional trial details:
I also tried the following ideas, but they didn't work:
I went through the syft source file 'syft/internal/licenses/list.go' and, based on the file names it lists, placed renamed copies of the LICENSE file in several folders, hoping that syft would pick up one of those names in one of them. The folders included:
'.../openjdk/legal/java.base/LICENSE'
'.../openjdk/bin/LICENSE'
'.../openjdk/LICENSE'

These trials did not succeed.

What you expected to happen:
The license value GPLv2+ should have been picked up and included in the JSON SBOM files, but it was not. The SPDX license fields show 'NOASSERTION'.

Steps to reproduce the issue:
Create a simple Linux image by downloading the OpenJDK tar binaries from GitHub (Adoptium). Then run a syft scan on it, generating SPDX or CycloneDX JSON output. Check the license field values for that component in the generated SBOM file.
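
The final check in these steps can be automated. The following is a minimal sketch, assuming the SBOM was written as SPDX JSON (for example with `syft <image> -o spdx-json`) and saved to a file. It reads only the standard SPDX 2.x package fields `name`, `versionInfo`, `licenseConcluded` and `licenseDeclared`; the helper names `packagesWithoutLicense` and `unasserted` are made up for this illustration and are not part of syft.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// spdxDocument models only the SPDX 2.x JSON fields this check needs.
type spdxDocument struct {
	Packages []struct {
		Name             string `json:"name"`
		VersionInfo      string `json:"versionInfo"`
		LicenseConcluded string `json:"licenseConcluded"`
		LicenseDeclared  string `json:"licenseDeclared"`
	} `json:"packages"`
}

// unasserted reports whether a license field carries no real value.
func unasserted(v string) bool {
	return v == "" || v == "NOASSERTION"
}

// packagesWithoutLicense (hypothetical helper) returns "name@version" for
// every package whose concluded and declared licenses are both unasserted.
func packagesWithoutLicense(data []byte) ([]string, error) {
	var doc spdxDocument
	if err := json.Unmarshal(data, &doc); err != nil {
		return nil, fmt.Errorf("parse SPDX JSON: %w", err)
	}
	var missing []string
	for _, p := range doc.Packages {
		if unasserted(p.LicenseConcluded) && unasserted(p.LicenseDeclared) {
			missing = append(missing, p.Name+"@"+p.VersionInfo)
		}
	}
	return missing, nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: checklicenses <sbom.spdx.json>")
		os.Exit(2)
	}
	data, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	missing, err := packagesWithoutLicense(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, m := range missing {
		fmt.Println("no license:", m)
	}
}
```

With the image described above, the 'java' component would be expected to appear in this list.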

Anything else we need to know?:
I believe the issue is the same for all binaries of any type, whether Java or Node.js, and from all GitHub projects/vendors.

Environment:

  • Output of syft version:
    Application: syft
    Version: 0.99.0
    BuildDate: 2023-12-21T16:18:46Z
    GitCommit: 3cffa0b
    GitDescription: v0.99.0
    Platform: linux/amd64
    GoVersion: go1.21.5
    Compiler: gc
  • OS (e.g: cat /etc/os-release or similar): RHEL 8.9 / UBI minimal - series 8 or 9 - any.
@mithunms333 mithunms333 added the bug Something isn't working label Apr 10, 2024
@spiffcs spiffcs added the enhancement New feature or request label Apr 11, 2024
@willmurphyscode willmurphyscode removed the bug Something isn't working label Apr 11, 2024
@mithunms333
Author

Dear team,
I understand that this issue is tagged as an enhancement. Until that change is delivered in the product, I still need syft to pick up the license names in my processes. Is there any change or manual workaround I can apply on my end to overcome this? For example, in which folder should I keep the LICENSE file for these downloaded binaries so that syft picks it up? That would be very helpful!

@tgerla
Contributor

tgerla commented Apr 18, 2024

Hi @mithunms333, unfortunately we don't have a ready workaround for you in this case. We are discussing some improvements to the binary catalogers and how to handle some special cases like the JDK and JRE. We do have another issue discussing a possible framework for "hints" that would give you some tools to customize the output of the SBOM on a per-cataloger basis: #31

I will go ahead and keep this issue open for you until we have a resolution, and if you need anything else please feel free to open another issue.

@kzantow
Contributor

kzantow commented Apr 18, 2024

Developer note: after a discussion about implementing this feature, we think the following approach may work reasonably well and help to scale the binary classifiers without the need to add individual catalogers for each case:

  • Add a configuration to the binary classifiers which allows post-processing after a package has been identified
  • Specifically for licenses, a function to locate and identify license information may be added that allows a relative path (and/or possibly an absolute path) to be specified to find license files present on the system.

An example of how this might look is (naming and exact details TBD, of course):

		{
			Class:    "java-binary-oracle",
			FileGlob: "**/java",
			EvidenceMatcher: FileContentsVersionMatcher(
					`(?m)\x00(?P<version>[0-9]+[.0-9]+[+][-0-9]+)\x00`),
			Package: "java/jre",
			PURL:    mustPURL("pkg:generic/java/jre@version"),
			CPEs:    singleCPE("cpe:2.3:a:oracle:jre:*:*:*:*:*:*:*:*"),
			Append:  licenseFromFiles("../legal/java.base/LICENSE", "./LICENSE"),
		},

So, in the event that a matching package is discovered by this cataloger, a secondary set of functions may run to append additional information to the package, in this example appending any license information found based on the paths relative to where the binary was located.
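
To illustrate the path handling such a post-processing step implies, here is a minimal sketch. It is not syft code: `licenseFromFiles` above is only a proposed name, and the helpers `candidateLicensePaths` and `firstLicenseFile` below, as well as the install root `/opt/openjdk`, are hypothetical. The sketch also assumes plain filesystem access, whereas a real cataloger would go through syft's own file resolver.

```go
package main

import (
	"fmt"
	"os"
	"path"
)

// candidateLicensePaths (hypothetical) resolves each relative path against
// the directory that contains the matched binary.
func candidateLicensePaths(binaryPath string, relPaths ...string) []string {
	dir := path.Dir(binaryPath)
	out := make([]string, 0, len(relPaths))
	for _, rel := range relPaths {
		// path.Join cleans the result, so "../" and "./" segments collapse.
		out = append(out, path.Join(dir, rel))
	}
	return out
}

// firstLicenseFile (hypothetical) returns the first candidate that exists
// as a regular file on the local filesystem.
func firstLicenseFile(candidates []string) (string, bool) {
	for _, c := range candidates {
		if info, err := os.Stat(c); err == nil && info.Mode().IsRegular() {
			return c, true
		}
	}
	return "", false
}

func main() {
	// "/opt/openjdk" is an assumed install root used only for illustration.
	candidates := candidateLicensePaths("/opt/openjdk/bin/java",
		"../legal/java.base/LICENSE", "./LICENSE")
	for _, c := range candidates {
		fmt.Println("candidate:", c)
	}
	if p, ok := firstLicenseFile(candidates); ok {
		fmt.Println("license file found:", p)
	} else {
		fmt.Println("no license file found")
	}
}
```

Identifying which license the located file actually contains would be a separate step and is left out here.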

@mithunms333
Author

mithunms333 commented Apr 19, 2024

Hi @kzantow
Sharing the path locations for OpenJDK:
In the OpenJDK tar downloaded from GitHub, the LICENSE file is present at:
.../openjdk/legal/java.base/LICENSE

The java binary executable is found at:
.../openjdk/bin/java

There are a few other supporting jars, probably covered by the same LICENSE, at:
.../openjdk/lib/*.jar
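
As a quick check of how these locations relate, the sketch below computes the LICENSE path relative to the directory of the java binary. The root `/opt/openjdk` is an assumed stand-in for the elided prefix above, and `licensePathRelativeToBinary` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// licensePathRelativeToBinary (hypothetical) expresses licensePath relative
// to the directory that holds binaryPath, using forward slashes.
func licensePathRelativeToBinary(binaryPath, licensePath string) (string, error) {
	rel, err := filepath.Rel(filepath.Dir(binaryPath), licensePath)
	if err != nil {
		return "", err
	}
	return filepath.ToSlash(rel), nil
}

func main() {
	// "/opt/openjdk" is an assumed install root used only for illustration.
	rel, err := licensePathRelativeToBinary(
		"/opt/openjdk/bin/java",
		"/opt/openjdk/legal/java.base/LICENSE",
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(rel)
}
```

Under that assumption the result matches the first relative path in the proposed classifier configuration, `../legal/java.base/LICENSE`.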
