Recipe rendering very slow in certain cases #5224

Closed
2 tasks done
mbargull opened this issue Mar 11, 2024 · 3 comments
Labels
source::contributor created by a frequent contributor type::bug describes erroneous operation, use severity::* to classify the type
Milestone

Comments

@mbargull
Member

Checklist

  • I added a descriptive title
  • I searched open reports and couldn't find a duplicate

What happened?

Recipe rendering can be very slow in certain cases.
Recently, I was confronted with the case of https://github.com/conda-forge/arrow-cpp-feedstock for which a conda-smithy rerender apparently takes between 10 and 30 minutes.
To put things into context:
The recipe

  1. is rendered 6 times by conda-smithy (for linux-64/linux-aarch64/linux-ppc64le/osx-64/osx-arm64/win-64),
  2. has outputs for 12 separate packages,
  3. produces 200 package builds (due to variants defined in conda-forge-pinning+.ci_support/migrations),
  4. includes 68 pin_subpackage calls via jinja.

Which resulted in roughly

  • 60'000 calls to .metadata.MetaData.get_recipe_text,
  • 50'000 calls to .jinja_context.pin_subpackage,
  • 120'000 calls to .metadata.select_lines,
  • 10'000'000 calls to .metadata.eval_selector.

(I stopped the tracing profiler at some point, so those numbers are extrapolated; the order of magnitude should match the actual numbers, though.)
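
For anyone who wants to collect comparable call counts, here is a minimal sketch using the standard library's cProfile and pstats. The functions `eval_selector`, `select_lines` and `render_many` below are toy stand-ins written for this example (they are not conda-build's implementations), and `count_calls` is a hypothetical helper. It reads the `stats` mapping of `pstats.Stats`, whose values start with the primitive and total call counts.

```python
import cProfile
import pstats


def eval_selector(line):
    # Toy stand-in: keep lines that carry no selector comment.
    return "#" not in line


def select_lines(text):
    # Toy stand-in: evaluates one selector per recipe line.
    return [line for line in text.splitlines() if eval_selector(line)]


def render_many(text, passes):
    # Toy stand-in for repeated partial parsing of the same recipe text.
    for _ in range(passes):
        select_lines(text)


def count_calls(func, *args):
    """Run func under cProfile and return {function name: total call count}."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    stats = pstats.Stats(profiler)
    counts = {}
    # Keys are (filename, lineno, name); values are (cc, nc, tt, ct, callers),
    # where nc is the total number of calls.
    for (_filename, _lineno, name), (_cc, nc, _tt, _ct, _callers) in stats.stats.items():
        counts[name] = counts.get(name, 0) + nc
    return counts


if __name__ == "__main__":
    counts = count_calls(render_many, "a\nb  # [win]\nc\n", 4)
    print(counts["select_lines"], counts["eval_selector"])
```

With a three-line text and four passes, `select_lines` is counted 4 times and `eval_selector` 12 times, which shows how the per-line function multiplies with every additional pass over the recipe.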

Higher call counts are to be expected due to conda-build's repeated partial parsing of the recipe.
They are additionally increased disproportionately by some oddities, such as .metadata.MetaData.build_id not being cached for .jinja_context.pin_subpackage at
https://github.com/conda/conda-build/blob/24.1.2/conda_build/jinja_context.py#L406
which means each pin_subpackage_against_outputs call will re-extract the recipe text over and over again due to
https://github.com/conda/conda-build/blob/24.1.2/conda_build/metadata.py#L1660 .
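
To illustrate the effect, here is a toy model of the pattern, not conda-build's actual code. The classes `UncachedMeta` and `CachedMeta`, the `extract_calls` counter and the `pin_all` loop are all hypothetical names made up for this sketch. It assumes the recipe text does not change between accesses, which is what would make caching safe in the first place.

```python
from functools import cached_property


class _BaseMeta:
    """Toy stand-in for a recipe metadata object (not conda-build's MetaData)."""

    def __init__(self, recipe_text):
        self._recipe_text = recipe_text
        self.extract_calls = 0  # how often the recipe text was re-extracted

    def get_recipe_text(self):
        # Stand-in for the expensive step (selector evaluation, parsing, ...).
        self.extract_calls += 1
        return self._recipe_text

    def _compute_build_id(self):
        # Hypothetical derivation; the point is only that it needs the text.
        return "build-" + str(len(self.get_recipe_text()))


class UncachedMeta(_BaseMeta):
    @property
    def build_id(self):
        # Recomputed on every access, so every access re-extracts the text.
        return self._compute_build_id()


class CachedMeta(_BaseMeta):
    @cached_property
    def build_id(self):
        # Computed once per instance, then served from the instance dict.
        return self._compute_build_id()


def pin_all(meta, n_pins):
    """Hypothetical loop standing in for n_pins pin_subpackage calls."""
    return [meta.build_id for _ in range(n_pins)]


if __name__ == "__main__":
    uncached, cached = UncachedMeta("name: demo\n"), CachedMeta("name: demo\n")
    pin_all(uncached, 68)
    pin_all(cached, 68)
    print(uncached.extract_calls, cached.extract_calls)  # 68 1
```

With 68 pins, the uncached variant extracts the text 68 times and the cached one only once. In the real code the per-extraction cost is then multiplied again by the number of outputs, variants and platforms listed above.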

Conda Info

No response

Conda Config

No response

Conda list

No response

Additional Context

No response

@mbargull mbargull added type::bug describes erroneous operation, use severity::* to classify the type source::contributor created by a frequent contributor labels Mar 11, 2024
@mbargull
Member Author

cc @xhochy, @h-vetinari since you have the "pleasure" of dealing with this issue when maintaining arrow-cpp-feedstock.

I've proposed gh-5225 to treat some of the symptoms of this issue.
It's not doing "much" (as in, the changes do not make the process >100 times faster, as actually fixing it would), but at least we'd get around a 4 times speedup for the arrow-cpp-feedstock case.

mbargull added a commit to mbargull/conda-build that referenced this issue Mar 20, 2024
This is a performance regression benchmark for
conda#5224

Signed-off-by: Marcel Bargull <marcel.bargull@udo.edu>
mbargull added a commit to mbargull/conda-build that referenced this issue Mar 21, 2024
This is a performance regression benchmark for
conda#5224

Signed-off-by: Marcel Bargull <marcel.bargull@udo.edu>
kenodegard pushed a commit that referenced this issue Mar 21, 2024
* Add benchmark for high pin_subpackage count recipe

This is a performance regression benchmark for
#5224

* Reduce size of test_pin_subpackage_benchmark

* Use render(..., variants=...) not Config.variant

---------

Signed-off-by: Marcel Bargull <marcel.bargull@udo.edu>
@mbargull
Member Author

This issue is currently being investigated.
Quick update on the recent progress:

@beeankha beeankha added this to the 24.5.x milestone Mar 25, 2024

2 participants