Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(gazelle): pure golang helper #1895

Merged
merged 12 commits into from
May 20, 2024
Merged

Conversation

hunshcn
Copy link
Contributor

@hunshcn hunshcn commented May 13, 2024

Remove gazelle plugin's python deps and make it hermetic. No more relying on the system interpreter.

Use TreeSitter to parse Python code and use https://github.com/pypi/stdlib-list to determine whether a module is in std lib.

Fixes #1825
Fixes #1599
Related #1315

@hunshcn hunshcn force-pushed the feat/go-helper branch 5 times, most recently from edbbe93 to 15f80a4 Compare May 14, 2024 06:56
@hunshcn hunshcn marked this pull request as ready for review May 14, 2024 07:34
@hunshcn hunshcn force-pushed the feat/go-helper branch 2 times, most recently from 08b3a7a to b7080bc Compare May 14, 2024 07:41
Copy link
Collaborator

@aignas aignas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is very exciting, so thank you very much for working on this.

Could you change the PR description to remove the words "pain in the ass", please? This will become a commit message and I would love to keep the git commit messages more formal.

gazelle/def.bzl Outdated
downloaded_file_path = "3.11.txt",
)

non_module_deps = module_extension(implementation = non_module_deps_impl)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you use bazel-skylib 1.6.1 you could use the same as was done in #1892.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

#
# See following for more info:
# https://github.com/bazelbuild/bazel-gazelle/issues/1513
embedsrcs = [":helper.zip"], # keep
embedsrcs = ["stdlib_list.txt"], # keep
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we include more lists and switch based on a value in the gazelle manifest?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a simple matter. Because there may be multiple python version in monorepo, this may require a new directive. I suggest keeping the status quo in this pr and using stdlist of 3.11.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM, could you please create an issue in the backlog to add this feature? I agree that having a directive or solving this issue in some way would be useful.

gazelle/python/BUILD.bazel Show resolved Hide resolved
gazelle/python/file_parser.go Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
gazelle/def.bzl Outdated Show resolved Hide resolved
@hunshcn hunshcn changed the title feat(gazelle)!: pure golang helper feat(gazelle): pure golang helper May 15, 2024
Copy link
Collaborator

@aignas aignas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in general, a few minor additions.

gazelle/WORKSPACE Show resolved Hide resolved
gazelle/python/BUILD.bazel Outdated Show resolved Hide resolved
gazelle/python/file_parser.go Show resolved Hide resolved
gazelle/python/file_parser_test.go Outdated Show resolved Hide resolved
@aignas
Copy link
Collaborator

aignas commented May 15, 2024

@dougthor42, @linzhp, could you please test this PR in your repos? Does this speed things up? Does it slow things down? Since you two have bigger repos, it would be nice to use them as tests to ensure there are no hidden edge cases.

@linzhp
Copy link
Contributor

linzhp commented May 15, 2024

Very excited to see this! I will test it in Uber later this week if that's ok.

@dougthor42
Copy link
Contributor

Oh this looks really great! Let's see how it does...

/cc @ssmall, another Googler who will be using this.

Summary

This branch causes Gazelle to take ~35% of the time, a savings of 40s for me. Hot damn that's awesome!

Before this PR After this PR Delta %
60.8 s 21.2 s 40.6 s ~35%

Background info:

  • 4731 python files. find src -type f -name "*.py" | wc -l
  • 1618 directories. find src -type d -not -name "__pycache__" | wc -l
  • 499 BUILD.bazel files previously generated by Gazelle. find src -type f -name "BUILD.bazel" | wc -l
  • ~220 defined Python requirements in requirements.in
  • bazel version: 7.1.1
  • 24 core AMD EPYC vCPU
  • 96 GB RAM
  • SSD
  • Using experimental bazel downloader for pip.parse? yes.
$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux rodete
Release:        n/a
Codename:       rodete

Test:

  1. Run hyperfine --warmup 1 --runs 5 'bazel run //:gazelle'
    • only 5 runs instead of the default 10 because the old method takes >1min and I'm impatient.
    • warmup allows bazel downloader to cache my project's python dependencies
  2. Make sure that the git status of the test project is clean.

Before this PR:

Commit 55f31a3

$ hyperfine --warmup 1 --runs 5 'bazel run //:gazelle'
Benchmark 1: bazel run //:gazelle
  Time (mean ± σ):     60.756 s ±  0.867 s    [User: 53.651 s, System: 7.891 s]
  Range (min … max):   60.018 s … 61.873 s    5 runs

After this PR:

Commit f6a190a

$ hyperfine --warmup 1 --runs 5 'bazel run //:gazelle'                                                                                                                                                                                                                                                                        
Benchmark 1: bazel run //:gazelle
  Time (mean ± σ):     21.190 s ±  0.174 s    [User: 27.083 s, System: 2.679 s]
  Range (min … max):   20.964 s … 21.392 s    5 runs

gazelle/python/BUILD.bazel Outdated Show resolved Hide resolved
@hunshcn hunshcn force-pushed the feat/go-helper branch 3 times, most recently from 0272c96 to 7a107e7 Compare May 16, 2024 14:26
@@ -42,9 +41,10 @@ const (
gazelleBinaryName = "gazelle_binary"
)

var gazellePath = mustFindGazelle()
var gazellePath string
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary? Since this is a global variable, it may be better to set it in TestMain or as it was done previously?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

var gazellePath = mustFindGazelle() will make go test . failed. Although this will not affect bazel test.

I simply modified it, and no longer use global variables. There is no need for global variables here.

gazelle/def.bzl Outdated
GAZELLE_PYTHON_RUNTIME_DEPS = [
]

non_module_deps = modules.as_extension(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please move this into gazelle/python/private:extensions.bzl? please? The modules.as_extension is something that is not our external API within the gazelle plugin, so it should not be a part of def.bzl.

@hunshcn hunshcn force-pushed the feat/go-helper branch 2 times, most recently from e0cf1d6 to 78aec8f Compare May 17, 2024 03:57
@linzhp
Copy link
Contributor

linzhp commented May 19, 2024

This works in Uber

Copy link
Collaborator

@aignas aignas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you so much for this contribution!

@aignas aignas added this pull request to the merge queue May 20, 2024
Merged via the queue into bazelbuild:main with commit 7fc7962 May 20, 2024
4 checks passed
aignas added a commit to aignas/rules_python that referenced this pull request May 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants