Feature improved java cataloging #2769

Open: wants to merge 26 commits into base: main

@GijsCalis GijsCalis commented Apr 11, 2024

As announced in PR #2669, I've improved the package detection for Java/Maven packages.

I've added support for the local Maven cache, because it usually contains all the required pom.xml files when scanning on a system where the code has been built.
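
To illustrate why the local cache is useful, here is a minimal sketch of how a pom file can be located in Maven's default local repository layout (`~/.m2/repository/<groupId as directories>/<artifactId>/<version>/<artifactId>-<version>.pom`). This is not the code in this PR: the function names `localPomPath`, `defaultLocalRepo` and `findLocalPom` are hypothetical, and the sketch assumes the repository has not been relocated through `settings.xml`.

```go
package main

import (
	"os"
	"path/filepath"
	"strings"
)

// localPomPath returns the path where Maven's default repository layout
// stores the pom for the given coordinates, below repoDir.
// Hypothetical helper for illustration only.
func localPomPath(repoDir, groupID, artifactID, version string) string {
	// The groupId "org.example.app" becomes the directories org/example/app.
	groupDir := filepath.Join(strings.Split(groupID, ".")...)
	return filepath.Join(repoDir, groupDir, artifactID, version, artifactID+"-"+version+".pom")
}

// defaultLocalRepo returns ~/.m2/repository, Maven's default location.
// Assumption: no custom localRepository is configured in settings.xml.
func defaultLocalRepo() (string, error) {
	home, err := os.UserHomeDir()
	if err != nil {
		return "", err
	}
	return filepath.Join(home, ".m2", "repository"), nil
}

// findLocalPom reports the pom path and whether a regular file exists there.
// A caller could fall back to the remote repository when ok is false.
func findLocalPom(repoDir, groupID, artifactID, version string) (path string, ok bool) {
	p := localPomPath(repoDir, groupID, artifactID, version)
	info, err := os.Stat(p)
	if err != nil || info.IsDir() {
		return "", false
	}
	return p, true
}
```

Because the lookup is a single `os.Stat` call, checking the local repository first costs very little, even when the network is also enabled.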

As a result, scanning is significantly more complete and faster; see the table below for test results.
I've run the tests on the following projects:

| Syft ver | project & config | time (s.) | total pkgs | pkgs with version | licenses found |
|---|---|---|---|---|---|
| v1.1.1 | httpcomponents with network | - | 21 | 12 | 7 |
| new | httpcomponents with network | - | 21 | 21 | 36 |
| v1.1.1 | petclinic no network | 1.3 | 24 | 5 | 2 |
| v1.1.1 | petclinic with network | 5.8 | 24 | 15 | 14 |
| new | petclinic no network, no local repo | 0.8 | 24 | 5 | 2 |
| new | petclinic no network, with local repo | 1.3 | 24 | 23 | 9 |
| new | petclinic with network, with local repo | 2.5 | 24 | 23 | 23 |
| v1.1.1 | zookeeper with network | 33.0 | 58 | 18 | 13 |
| v1.1.1 | zookeeper no network | 0.2 | 57 | 18 | 0 |
| new | zookeeper no network, no local repo | 0.4 | 57 | 54 | 1 |
| new | zookeeper with network, no local repo | 7.0 | 57 | 57 | 122 |
| new | zookeeper with network, with local repo | 5.0 | 57 | 57 | 122 |
| new | zookeeper no network, with local repo (after maven build) | 0.6 | 57 | 57 | 122 |

Attached are some SBOM files generated by syft v1.1.1 and by the version in this PR:
sbom.cyclonedx.httpcomponents-new.json
sbom.cyclonedx.httpcomponents-v1.1.1.json
sbom.cyclonedx.jackson-new.json
sbom.cyclonedx.jackson-v1.1.1.json
sbom.cyclonedx.zookeeper-new-no-network-with-local-repo-after-build.json
sbom.cyclonedx.zookeeper-v1.1.1-with-network.json
sbom.cyclonedx.petclinic-new-no-network-no-local-repo.json
Uploading sbom.cyclonedx.petclinic-v1.1.1-with-network.json…

Signed-off-by: Gijs Calis <51088038+GijsCalis@users.noreply.github.com>
Update configuration documentation
Improve maven groupid detection

@GijsCalis GijsCalis marked this pull request as ready for review April 12, 2024 16:01
@GijsCalis

Note: the 'Detect schema changes / Label changes' check failed, but it should pass when the job is re-run.

GijsCalis commented Apr 18, 2024

@kzantow, @willmurphyscode :
I realise this is quite a large PR, but it also makes significant improvements. Without these improvements syft does not work well enough for practical use, especially on Spring Framework based packages. The number of false positives caused by missing version numbers is simply too high (most of the projects at my company are based on Spring).

I can split this PR into smaller parts, each adding part of the improvements:

  • use of local Maven repository:

    • Greater chance of finding licenses on systems that have built the package, because no network access (UseNetwork) is required.
    • Much faster scans when the network is used and the pom file is available locally, which also limits requests to the remote repository.
  • Fix bug in configuration: no default configuration is loaded for the java cataloger (see: https://github.com/GijsCalis/syft/blob/ff1c8431704181ede8edf3325004f3163f3283e6/cmd/syft/internal/options/java.go#L12)

  • Parsing of parent poms for property definitions (to improve resolving of properties) and processing imported managed dependencies (which contain version definitions of dependencies).
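
The parent-pom item above can be pictured with a small sketch of property resolution: a `${...}` placeholder is looked up in the current pom first and then in each ancestor. This is only an illustration under simplifying assumptions, not the implementation in this PR. The `pom` type and its methods are hypothetical, and real Maven interpolation also covers built-in `project.*` values, settings and system properties.

```go
package main

import "regexp"

// pom is a hypothetical, stripped-down model of a parsed pom.xml:
// only its <properties> and a link to its parent pom.
type pom struct {
	properties map[string]string
	parent     *pom
}

// placeholder matches Maven-style references such as ${spring.version}.
var placeholder = regexp.MustCompile(`\$\{([^}]+)\}`)

// lookup searches the pom and then its ancestors, so a child definition
// overrides the same property defined in a parent.
func (p *pom) lookup(name string) (string, bool) {
	for cur := p; cur != nil; cur = cur.parent {
		if v, ok := cur.properties[name]; ok {
			return v, true
		}
	}
	return "", false
}

// resolve substitutes placeholders repeatedly, because a property value may
// itself reference another property. Unknown placeholders are left as-is.
func (p *pom) resolve(value string) string {
	const maxDepth = 10 // guards against cyclic definitions such as a=${a}
	for i := 0; i < maxDepth; i++ {
		changed := false
		value = placeholder.ReplaceAllStringFunc(value, func(m string) string {
			name := m[2 : len(m)-1] // strip the leading "${" and trailing "}"
			if v, ok := p.lookup(name); ok {
				changed = true
				return v
			}
			return m
		})
		if !changed {
			break
		}
	}
	return value
}
```

With a parent that defines `dep.version` as `${spring.version}` and `spring.version` as some concrete value, resolving `${dep.version}` on the child yields that concrete value, which is the kind of version a scan cannot determine without reading the parent pom.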

What would be the best way forward?
