Compile times scale with number of templates #826

Open
wrapperup opened this issue Jun 20, 2023 · 6 comments

Comments

@wrapperup
Contributor

wrapperup commented Jun 20, 2023

Hi, I'm enjoying askama, it's fantastic. I noticed that compile times for templates grow considerably if you have many templates, which I don't think is too surprising considering it's a proc-macro. The main issue is when making incremental changes, askama will recompile all templates, even if they didn't change. This hurts iteration times a bit.

I thought of caching templates in the user's project (under a .askama directory) without using a build script. Then I saw #689, but it doesn't seem to have gone anywhere. Have there been any plans or further thoughts on continuing that work?

IMO, it would be great if it could be done purely in the proc-macro. In my limited testing (about 100 templates), just skipping the compile step already sped things up considerably, so the overhead of invoking the proc-macro and parsing its input seems too small to notice. I saw there were some unstable Rust features for tracking files in proc-macros, which might be useful for this, but who knows how long those will take to stabilize.

@djc
Owner

djc commented Jun 21, 2023

> The main issue is when making incremental changes, askama will recompile all templates, even if they didn't change. This hurts iteration times a bit.

I find this surprising. Have you verified this? Do you understand why this is the case?

The work in #689 hasn't gone anywhere as far as I know. I still think the code generation at test time would be the optimal approach, but if you have concrete benchmark numbers to demonstrate the benefits of a caching approach I'd be open to considering that as well.

@wrapperup
Contributor Author

wrapperup commented Jun 21, 2023

So, my test implements 100 templates in a single module, on Windows 10. Also tried on WSL, and I got similar results.

use askama::Template;

#[derive(Template)]
#[template(path = "index_001.html")]
struct Index001 {
    test: String,
}

...

#[derive(Template)]
#[template(path = "index_100.html")]
struct Index100 {
    test: String,
}
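For anyone who wants to reproduce a module like this without typing it by hand, here is a minimal sketch of a generator. The function name `generate_template_structs` is hypothetical and not part of askama or the benchmark repo; it only emits source text in the same shape as the snippet above.

```rust
use std::fmt::Write;

/// Builds Rust source text for `count` template structs named
/// Index001, Index002, ... following the pattern shown above.
/// Hypothetical helper for benchmarking; not part of askama.
fn generate_template_structs(count: usize) -> String {
    let mut source = String::from("use askama::Template;\n");
    for i in 1..=count {
        // Writing into a String cannot fail, so unwrap is safe here.
        write!(
            source,
            "\n#[derive(Template)]\n#[template(path = \"index_{0:03}.html\")]\nstruct Index{0:03} {{\n    test: String,\n}}\n",
            i
        )
        .unwrap();
    }
    source
}
```

Calling `generate_template_structs(100)` and writing the result to a module file would give the 100-struct setup described here; the matching `index_NNN.html` files would still need to exist in the templates directory.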

Making a change to just one of these templates results in a compile time of ~3.10s.
I ran a self-profile to get a flamegraph, and curiously the proc-macro stage was the longest.

[image: flamegraph]

Also in the vanilla case, I stubbed out all of the templates, except for one, and the build time dropped to ~0.40s.

I also stubbed out include_bytes! to see whether file tracking had side effects that caused all the macros to run again, but that didn't change anything (I got similar results to the above).

I made a very quick-and-dirty prototype to cache the compiled template (it hashes the entire proc-macro AST, but not the template itself), and the build time dropped to ~0.43s for one template, and ~1.78s for compiling all the templates.
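To illustrate the general idea (this is not the code from the prototype branch), here is a minimal sketch of a file-backed cache keyed on a hash of the macro input. The names `macro_input_key` and `cached_or_generate`, the cache file layout, and passing the macro input as a plain string are all assumptions made for the example. Like the prototype described above, the key ignores the template file itself.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::{Hash, Hasher};
use std::io;
use std::path::Path;

/// Hashes the derive input (as text) into a cache key.
/// Note: DefaultHasher's algorithm is not guaranteed to stay the same
/// across Rust releases, so a real cache may want a fixed hash function.
fn macro_input_key(macro_input: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    macro_input.hash(&mut hasher);
    hasher.finish()
}

/// Returns the cached generated code if an entry exists; otherwise runs
/// `generate`, stores its output under `cache_dir`, and returns it.
fn cached_or_generate<F>(cache_dir: &Path, macro_input: &str, generate: F) -> io::Result<String>
where
    F: FnOnce() -> String,
{
    let entry = cache_dir.join(format!("{:016x}.rs", macro_input_key(macro_input)));
    if let Ok(code) = fs::read_to_string(&entry) {
        // Cache hit: skip the expensive code generation step.
        return Ok(code);
    }
    let code = generate();
    fs::create_dir_all(cache_dir)?;
    fs::write(&entry, &code)?;
    Ok(code)
}
```

In a real proc-macro, `generate` would be the existing template compilation step, and the returned string would be parsed back into a token stream.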

Flamegraph for one template change with the cache (much better!):
[image: flamegraph]

and all:
[image: flamegraph]

Honestly, I have no idea why it's faster when compiling all templates with the cache enabled; that part doesn't really make sense to me. If you're curious, I put the prototype implementation here. It's definitely a draft though: https://github.com/wrapperup/askama/tree/cache-templates

And of course, this wasn't a super scientific test, but I think this should be good enough.

@djc
Owner

djc commented Jun 22, 2023

Okay, so in a crate with 100 templates, how long does cargo check take after touching 1 template, with the cache vs. without it?

@wrapperup
Contributor Author

wrapperup commented Jun 22, 2023

I get similar results. Without the cache, changing 1 template takes ~2.80s. With the cache, it takes ~0.20s for 1 template, and ~2.82s for all of them.

Here's my test repo: https://github.com/wrapperup/askama-macro-benchmark

@djc
Owner

djc commented Jun 22, 2023

Okay, let's have a PR for a cache. Can you somehow include the template contents in the freshness calculation?
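One way the freshness calculation could take the template contents into account, sketched under assumptions: the function name `freshness_key` is hypothetical, and the macro input is passed as a plain string for simplicity. The key changes whenever either the derive input or the bytes of the template file change.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::{Hash, Hasher};
use std::io;
use std::path::Path;

/// Computes a cache key from both the derive input and the current
/// contents of the template file, so editing either one invalidates
/// the cached entry. Hypothetical helper; not askama's API.
fn freshness_key(macro_input: &str, template_path: &Path) -> io::Result<u64> {
    let template_source = fs::read(template_path)?;
    let mut hasher = DefaultHasher::new();
    macro_input.hash(&mut hasher);
    template_source.hash(&mut hasher);
    Ok(hasher.finish())
}
```

This sketch only covers a single file. Templates that extend or include other templates would presumably need those files hashed as well, otherwise a change in a parent template would go unnoticed.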

@wrapperup
Contributor Author

Yep! This needs a bit of cleanup, but I can probably have something soonish.
