Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/go/analysis: add Pass.Module field #66315

Open
adonovan opened this issue Mar 14, 2024 · 26 comments
Open

x/tools/go/analysis: add Pass.Module field #66315

adonovan opened this issue Mar 14, 2024 · 26 comments
Assignees
Labels
Analysis Issues related to static analysis (vet, x/tools/go/analysis) Proposal Proposal-Accepted
Milestone

Comments

@adonovan
Copy link
Member

adonovan commented Mar 14, 2024

Proposal Details

Some analyzers need information about the module to which a package belongs, such as its version of Go. We have already exposed this information in x/tools/go/packages (and I observe that most features of packages.Package that are plausibly build system-agnostic eventually end up needing to be exposed through analysis.Pass too).

I propose that we define a Pass.Module field, something like this: [see bottom of note for most current proposal]

package analysis

type Pass struct {
    ...
    Module *Module // optional
}

// Module provides information about the module (if any) to which a package belongs.
type Module struct {
 	Path      string       // module path
	Version   string       // module version
	Replace   *Module      // replaced by this module
	Time      *time.Time   // (optional) time version was created
	Main      bool         // is this the main module?
	Indirect  bool         // is this module only an indirect dependency of main module?
	Dir       string       // directory holding files for this module, if any
	GoMod     string       // path to go.mod file used when loading this module, if any
	GoVersion string       // go version used in module
	Error     error        // error loading module
}

This set of fields exactly matches what is in packages.Module. Clearly this is the maximal set of fields, since a go/packages-based driver will not be able to set any additional (non-derived) fields. But perhaps a more minimal set might be a safer starting point; suggestions welcome.

Build systems other than go list, such as Bazel, Blaze, Pants, Buck, Please and so on, will need to simulate the module information if they do not use go.mod files. I am interested to hear from maintainers of Go integrations in those systems whether this set of fields looks manageable. See:

Update (Apr 4):

package analysis

type Pass struct {
    ...
    Module *Module // optional
}

// Module provides information about the module (if any) to which a package belongs.
type Module struct {
 	Path      string       // module path
	Version   string       // module version
	GoVersion string       // go version used in module
}
@fmeum
Copy link

fmeum commented Mar 15, 2024

Thanks for the heads-up.

Bazel's rules_go has adopted go.mod as the source of truth for dependency declarations with Bazel's new dependency management system Bzlmod (https://github.com/bazelbuild/rules_go/blob/master/docs/go/core/bzlmod.md#specifying-external-dependencies).

We would thus be able to fill these fields (maybe except for Time, but it is optional).

@timothy-king
Copy link
Contributor

My suggestion for a minimal set of fields {Path, Version, GoVersion}.

@timothy-king
Copy link
Contributor

Should we make Pass.Module an optional feature only to be filled in if it is requested by a boolean field on the Analyzer? Such as RequiresModule. This gives an understandable default of what to do when Module.Error != nil. If RequiresModule is on and Module.Error != nil, skip the pass and warn the user of an error.

@adonovan
Copy link
Member Author

adonovan commented Mar 18, 2024

That wouldn't be a compatible change. If we add the field, then a driver that hasn't implemented support for it yet will run analyzers unconditionally, despite their RequiresModule declaration, so they will crash.

Drivers and analyzers must both be able to handle missing module information.

@timothy-king
Copy link
Contributor

Good point. What do we want to do when Module.Error != nil then? Some obvious options:

  1. Not run the Pass under the premise there is an error on the package?
  2. Not provide the Pass the Module value, i.e. act as if it is missing?
  3. Provide the Pass the value of Module.Error to decide for itself?
  4. Something else?

The first option seems like the easiest to relax in the future.

@adonovan
Copy link
Member Author

Option 3. If the pass is running, it means we got a package (possibly with errors if RunDespiteErrors). The analyzer can deal with them.

@dr2chase
Copy link
Contributor

Maybe related, a recent very minimal CL (not yet submitted) to forward module path information from vet (which already contains module version). https://go.dev/cl/571016. Perhaps that CL should be revised, @michaelmatloob suggested a design document or issue to link to, neither of which yet exists, which is why I have not yet submitted.

There's no guarantee (as I understand it) that packages from which a vet analysis imports facts, have themselves been analyzed to ensure that those facts are exported. I verified that the export-import worked if I pre-ran the analyses by hand, but if I did not do that, then there are no facts to import.

@adonovan
Copy link
Member Author

Maybe related, a recent very minimal CL (not yet submitted) to forward module path information from vet (which already contains module version). https://go.dev/cl/571016. Perhaps that CL should be revised, @michaelmatloob suggested a design document or issue to link to, neither of which yet exists, which is why I have not yet submitted.

This proposal discussion will determine whether we go for a "maximalist" approach (plumb through everything from go/packages.Module) or a minimalist one (just module path and version). Either way, your linked CL will need to specify at least one additional field.

There's no guarantee (as I understand it) that packages from which a vet analysis imports facts, have themselves been analyzed to ensure that those facts are exported. I verified that the export-import worked if I pre-ran the analyses by hand, but if I did not do that, then there are no facts to import.

There is such a guarantee, except for the standard library. (That unfortunate exception is because the std lib is precompiled in Blaze and Bazel, not built on demand.) In other words, if a driver makes a pass of analyzer A on package P, where A uses facts and P imports Q, then the driver must already have executed pass (A, Q) successfully, unless Q is in std.

@timothy-king
Copy link
Contributor

Maybe related, a recent very minimal CL (not yet submitted) to forward module path information from vet (which already contains module version). https://go.dev/cl/571016.

@dr2chase Definitely related. For that CL to help an Analyzer, this proposal or something similar will need to be approved to give that data a place to be plumbed to. You may want to wait on this being approved.

It would also help to know what the minimal data you need is. Also your 2 cents on maximal interface or minimal interface would be appreciated.

@peterebden
Copy link

Thanks for looping us in on this!

Please's Go plugin has most of the module information available to it; these days it's typically synced from the go.mod file using Puku. Hence I expect we shouldn't have much trouble producing this information.

The only fields I expect we would be unlikely to fill in are Time (which is optional anyway), Replace (although there are similar mechanisms, we don't really have a first-class concept of it) and possibly Indirect (we have that information at some level but I don't know how available it will be to this analysis). I'd anticipate that's unlikely to cause any major issues?

Overall the proposal LGTM.

@timothy-king
Copy link
Contributor

I'd anticipate that's unlikely to cause any major issues?

If the fields are not optional, Analyzers can assume they are always provided. If not present, the Analyzer could crash (nil-pointer dereference) or otherwise misbehave.

The only fields I expect we would be unlikely to fill in are Time (which is optional anyway), Replace (although there are similar mechanisms, we don't really have a first-class concept of it) and possibly Indirect (we have that information at some level but I don't know how available it will be to this analysis).

@peterebden Is it fair to say that this is a request that Module.{Time,Replace,Indirect} are optional or not present?

@dr2chase
Copy link
Contributor

dr2chase commented Apr 1, 2024

@timothy-king Sorry for the late reply, been out-of-hemisphere and hacking on other problems also.

The information that (I think) I need is just Path and Version, but I am a little worried about the fields that I don't know for sure that I understand (that might affect the effective version, for example). So if in fact I need those fields to correctly answer "are these two packages from the same module or not?" or "what Go version applies to compilation of this package?" then they should be included, otherwise, if nobody has a use for them yet, then maybe wait. Maybe.

I approve of it being a pointer-to-a-thing, so that "nope, no module info here" can be signaled with a nil pointer.

One thing to remember is that we'll want to fill this in, in the .cfg that vet passes to analysis passes. I have a pair of CLs for doing this with fragileconversion and I can verify that with that change it works, without it does not (unitchecker, vet).

I went to look at the full set of module information available to go-build-vet-etc, and saw

type ModulePublic struct {
	Path       string           `json:",omitempty"` // module path
	Version    string           `json:",omitempty"` // module version
	Query      string           `json:",omitempty"` // version query corresponding to this version
	Versions   []string         `json:",omitempty"` // available module versions
	Replace    *ModulePublic    `json:",omitempty"` // replaced by this module
	Time       *time.Time       `json:",omitempty"` // time version was created
	Update     *ModulePublic    `json:",omitempty"` // available update (with -u)
	Main       bool             `json:",omitempty"` // is this the main module?
	Indirect   bool             `json:",omitempty"` // module is only indirectly needed by main module
	Dir        string           `json:",omitempty"` // directory holding local copy of files, if any
	GoMod      string           `json:",omitempty"` // path to go.mod file describing module, if any
	GoVersion  string           `json:",omitempty"` // go version used in module
	Retracted  []string         `json:",omitempty"` // retraction information, if any (with -retracted or -u)
	Deprecated string           `json:",omitempty"` // deprecation message, if any (with -u)
	Error      *ModuleError     `json:",omitempty"` // error loading module
	Sum        string           `json:",omitempty"` // checksum for path, version (as in go.sum)
	GoModSum   string           `json:",omitempty"` // checksum for go.mod (as in go.sum)
	Origin     *codehost.Origin `json:",omitempty"` // provenance of module
	Reuse      bool             `json:",omitempty"` // reuse of old module info is safe
}

and was curious about whether any of this should be included, if it happens to be available in an analysis framework.
Deprecated seemed relevant, perhaps.

@adonovan
Copy link
Member Author

adonovan commented Apr 4, 2024

Let's stick with the subset of fields Tim proposed: {Path, Version, GoVersion}; the rest are more obscure. We can always add more later when a clear need arises.

I've appended the updated API proposal to the bottom of the first comment.

Any objections?

@rsc
Copy link
Contributor

rsc commented Apr 4, 2024

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc
Copy link
Contributor

rsc commented Apr 10, 2024

Have all remaining concerns about this proposal been addressed?

The proposal is to add

package analysis

type Pass struct {
    ...
    Module *Module // optional
}

// Module provides information about the module (if any) to which a package belongs.
type Module struct {
 	Path      string       // module path
	Version   string       // module version
	GoVersion string       // go version used in module
}

@rsc
Copy link
Contributor

rsc commented Apr 24, 2024

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

The proposal is to add

package analysis

type Pass struct {
    ...
    Module *Module // optional
}

// Module provides information about the module (if any) to which a package belongs.
type Module struct {
 	Path      string       // module path
	Version   string       // module version
	GoVersion string       // go version used in module
}

@timothy-king
Copy link
Contributor

timothy-king commented Apr 26, 2024

I have a candidate implementation here https://go.dev/cl/577996.

One thing I observed was that Pass.Module.GoVersion Pass.Module.Version is empty on locally checked out packages, and populated for dependencies on other modules. I believe this is WAI and what one would expect from go list. It may be confusing to have different results on the "same" package though.

IMO what we should do is document that "" means unknown and that this includes local packages so Analyzer writers are aware of this. Alternatively this could be a reason to have Pass.Module == nil to signal 'unknown'.

@dr2chase
Copy link
Contributor

dr2chase commented May 6, 2024

I had imagined Pass.Module == nil to signal "no information".

@timothy-king
Copy link
Contributor

timothy-king commented May 7, 2024

@dr2chase We need to make a decision about the partial information case. We have Pass.Module.{Path,GoVersion} but not Pass.Module.Version == "" for local packages. Or we can try to have Pass.Module==nil in this case. This is not "no information" just "some information is unknown". So I'm not quite sure if your comment is a vote in either direction.

@dr2chase
Copy link
Contributor

dr2chase commented May 7, 2024

If you have some information, I vote for not-nil. I thought not-nil was being proposed for the actually-there-is-no-module-at-all case. For same/different module questions, path is useful.

@adonovan
Copy link
Member Author

adonovan commented May 7, 2024

I thought not-nil was being proposed for the actually-there-is-no-module...

(I assume you mean "nil" here.)

I think reporting partial information may be ok, but "nil" cannot mean exactly "there is no module", because for the locally edited main module, we do not have any module information even though the module surely does exist and has a name and presumably a pseudoversion.

@rsc
Copy link
Contributor

rsc commented May 8, 2024

In module mode there really is always a module. We should make Module != nil always, and leave out the fields that we don't have info about. Even &Module{"", "", ""} is fine. Then there are no nil pointer problems.

@rsc
Copy link
Contributor

rsc commented May 8, 2024

@adonovan points out that code other than x/tools generates Pass structures when running analyzers. Those will not know about Module, so analyzers will have to handle Module==nil. But we can still arrange that Module!=nil for all the analyzer-runners that we own.

@rsc
Copy link
Contributor

rsc commented May 8, 2024

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

The proposal is to add

package analysis

type Pass struct {
    ...
    Module *Module // optional
}

// Module provides information about the module (if any) to which a package belongs.
type Module struct {
 	Path      string       // module path
	Version   string       // module version
	GoVersion string       // go version used in module
}

@rsc rsc changed the title proposal: x/tools/go/analysis: add Pass.Module field x/tools/go/analysis: add Pass.Module field May 8, 2024
@rsc rsc modified the milestones: Proposal, Backlog May 8, 2024
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/577996 mentions this issue: go/analysis: Add modules to Pass

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/580076 mentions this issue: cmd/go: add module information to vet actions

gopherbot pushed a commit that referenced this issue May 13, 2024
Update #66315

Change-Id: Ica9b7e010ea9a0a12f80cc83b8ace51f22822ec2
Reviewed-on: https://go-review.googlesource.com/c/go/+/580076
Reviewed-by: Michael Matloob <matloob@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Analysis Issues related to static analysis (vet, x/tools/go/analysis) Proposal Proposal-Accepted
Projects
Status: Accepted
Development

No branches or pull requests

7 participants