Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: ensure that npm dependencies retain their "production" grouping #939

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

G-Rath
Copy link
Collaborator

@G-Rath G-Rath commented Apr 23, 2024

This resolves an inconsistency in the scanner output for npm packages that appear in the same tree multiple times but in different groups; this happens because the table outputter deduplicates on groups on the assumption that a package can only appear in a single group which is incorrect for the NPM ecosystem.

To address this, I've introduced an internal map type that ensures groups are merged when a package is added, with the twist that if either instance of a package being merged is in no groups then that is the result of the merge because implicitly that means an instance of the package is in the "production" group which takes priority over the other groups.

This is something that should most likely be improved on the future, but right now this fix should be good enough to ship since afaik it doesn't impact any other ecosystem and groups are something of a PoC given we can't resolve that information richly enough across all ecosystems yet.

I think this technically could impact pnpm as well, but none of our fixtures gave a good indicator and the latest lockfile version doesn't have that information anymore so frankly I'm not worrying about it at this point.

Resolves #924

@codecov-commenter
Copy link

codecov-commenter commented May 1, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 63.69%. Comparing base (2b165ea) to head (d38705d).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #939      +/-   ##
==========================================
+ Coverage   63.62%   63.69%   +0.06%     
==========================================
  Files         146      146              
  Lines       11947    11961      +14     
==========================================
+ Hits         7601     7618      +17     
+ Misses       3878     3876       -2     
+ Partials      468      467       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

// the merge happens almost as you'd expect, except that if either given packages
// belong to no groups, then that is the result since it indicates the package
// is implicitly a production dependency.
func mergeNpmDepsGroups(a, b PackageDetails) []string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems we only need DepGroups so can the func be

func mergeNpmDepGroups(a, b []string) []string

?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could but that'd mean every caller of this function would have to explicitly access DepGroups themselves - so it's cleaner for us to take PackageDetails and do that accessing ourselves.


for name, detail := range m1 {
details[name] = detail
type npmPackageDetailsMap map[string]PackageDetails
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add a comment stating that the key is name and version

@oliverchang oliverchang requested review from michaelkedar and removed request for josieang and hogo6002 May 1, 2024 06:32
Copy link
Collaborator

@oliverchang oliverchang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll defer this to @michaelkedar, who is our npm expert. WDYT of this approach?

@michaelkedar
Copy link
Member

Looking a bit into this, there's another awkward edge case with optional prod dependencies & non-optional dev dependencies (devOptional) that we might want to consider.

dev, optional, devOptional: If the package is strictly part of the devDependencies tree, then dev will be true. If it is strictly part of the optionalDependencies tree, then optional will be set. If it is both a dev dependency and an optional dependency of a non-dev dependency, then devOptional will be set. (An optional dependency of a dev dependency will have both dev and optional set.)

https://docs.npmjs.com/cli/v9/configuring-npm/package-lock-json#packages

Our groups parser doesn't seem to get this correct at the moment:

func (pkg NpmLockPackage) depGroups() []string {
if pkg.Dev {
return []string{"dev"}
}
if pkg.Optional {
return []string{"optional"}
}
if pkg.DevOptional {
return []string{"dev", "optional"}
}
return nil
}

So I guess we ought to differentiate between "optional in prod, required in dev" (devOptional) and "not in prod, optional in dev" (dev && optional) somehow, and also apply that to the deduplication logic.

This whole thing kind of comes to some thoughts I've been having on counting vulnerabilities and how or whether we should be deduplicating vulnerabilities in an npm lockfile in the first place. I feel like it'd generally be clearer to have one entry per vulnerability ID, with a complete list of instances of affected packages.
Maybe that's a thing to add to the V2 Wishlist.

@G-Rath
Copy link
Collaborator Author

G-Rath commented May 2, 2024

I don't think I disagree with that but frankly it makes my head spin a bit 😅

Personally I feel like it's probably safest to at least start by just treating groups as part of the uniqueness of a package - I'm pretty sure for most ecosystems that won't change anything because you can only have instance of a package regardless of group, and for ecosystems like NPM it'd mean you would just end up with some more packages in the output.

Then we can work on how to make the output smarter without worrying about the current version of the cli giving false negatives in the meantime

@G-Rath
Copy link
Collaborator Author

G-Rath commented May 2, 2024

note to self: consider exploring if an empty string could be useful here - i.e. if a package is not optional or dev, then its ""

(though thinking about it, I think that could just lead back to this same solution - I think either way, its probably correct more as it needs to be handled within the extractor rather than at the outputter level right now)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Scan groups are inconsistent when there are duplicate packages in npm
5 participants