Ensure we only rebuild indexes when necessary #265

SteveMarshall · 2024-03-26T17:31:22Z

Previously, we were reindexing on every container rebuild, regardless of whether the mission data or backend code had changed.

Given these are the only things that actually affect indexing, though, we can copy just those to the container, reindex, then copy everything else. In the majority of cases, that makes container builds much faster, because they're not spending 3-4 minutes reindexing.
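For illustration only (this is not code from the PR), here is a simplified Python model of how a layer cache treats that ordering. It assumes each step's cache key depends on its own inputs plus the key of the step before it, so a change invalidates that step and everything after it. The step names and file paths are hypothetical.

```python
import hashlib


def layer_keys(steps):
    """Return one cache key per build step.

    Each key hashes the step's inputs together with the previous key,
    so a change invalidates that step and every step after it.
    """
    keys = []
    previous = ""
    for name, inputs in steps:
        digest = hashlib.sha256()
        digest.update(previous.encode())
        digest.update(b"\0" + name.encode())
        for path in sorted(inputs):
            digest.update(b"\0" + path.encode())
            digest.update(b"\0" + inputs[path].encode())
        previous = digest.hexdigest()
        keys.append(previous)
    return keys


def rebuilt_steps(old_steps, new_steps):
    """Names of the steps whose cache key differs between two builds."""
    old_keys = layer_keys(old_steps)
    new_keys = layer_keys(new_steps)
    return [
        name
        for (name, _), old, new in zip(new_steps, old_keys, new_keys)
        if old != new
    ]


def build(missions, backend, everything_else):
    # Hypothetical step order mirroring the description above.
    return [
        ("copy missions and backend", {**missions, **backend}),
        ("reindex", {}),
        ("copy everything else", everything_else),
    ]


# Hypothetical file names mapped to stand-in "contents".
missions = {"missions/a13/TEC": "v1"}
backend = {"backend/indexer.py": "v1"}
other = {"website/base.html": "v1"}

before = build(missions, backend, other)
template_edit = build(missions, backend, {"website/base.html": "v2"})
transcript_edit = build({"missions/a13/TEC": "v2"}, backend, other)

print(rebuilt_steps(before, template_edit))
print(rebuilt_steps(before, transcript_edit))
```

In this model, editing a template only reruns the final copy, while editing a transcript reruns all three steps, including the reindex.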

To make this work, I've made the Makefile more accurately describe the dependency relationship between stats graphs (at least, the 0th graph, which every mission will have) and transcript files. That way, the later call to make collectstatic doesn't try to re-run the previously always-run make statsporn.
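As a rough sketch of the rule make applies once that dependency is declared (assuming the usual behaviour: rebuild when the target is missing or older than a prerequisite), the check looks something like this in Python. The file names are hypothetical stand-ins for a transcript and a mission's 0th graph.

```python
import os
import tempfile


def needs_rebuild(target, prerequisites):
    """Rebuild if the target is missing or older than any prerequisite."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(p) > target_mtime for p in prerequisites)


with tempfile.TemporaryDirectory() as workdir:
    transcript = os.path.join(workdir, "TEC")      # hypothetical transcript file
    graph = os.path.join(workdir, "graph_0.png")   # hypothetical 0th graph

    with open(transcript, "w") as handle:
        handle.write("transcript")
    os.utime(transcript, (1000, 1000))

    # No graph yet, so it has to be generated.
    missing = needs_rebuild(graph, [transcript])

    with open(graph, "w") as handle:
        handle.write("graph")
    os.utime(graph, (2000, 2000))

    # Graph is newer than the transcript, so a later make call skips it.
    fresh = needs_rebuild(graph, [transcript])

    # Touching the transcript makes the graph stale again.
    os.utime(transcript, (3000, 3000))
    stale = needs_rebuild(graph, [transcript])

print(missing, fresh, stale)
```

The middle case is the one that matters here: with the graph already up to date, a later `make collectstatic` has no reason to regenerate it.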

This also enables single-mission stats generation, and clarifies some instructions in the README around reindexing (and some other minor bits and bobs).

jaylett · 2024-03-28T19:05:55Z

Breaking the automatic creation of stats for collectstatic is absolutely fine, but I wonder if a better way is to rely less on make for docker builds and drive those Django commands directly. I assume (but may be wrong) that the preferred way (if not now then in theory in future) of doing dev is via docker, so make stops being a sensible choice fairly quickly. Prior to this PR the semantics of collectstatic are "create all static files as necessary then process them into the right place for them to serve correctly". After this PR it's "create some static files but not others then process all of them into the right place for them to serve correctly".

My instinct is to break the productioncss > collectstatic link (and introduce a new allstatic target in make if that's necessary). In future the trivial make targets could be elided in the dockerfile, but that would leave us in a semantically clean space.

Or don't do that, and change the commit message to state that it's fine but slightly confusing, and that we can't eliminate use of make because of the current complexity of CSS building.

SteveMarshall · 2024-03-30T22:40:00Z

@jaylett The more I think about it, the more I think the correct solution here is for me to not be quite so lazy, and to make the Makefile more accurately reflect the dependencies inherent in the graph generation, so I've done that.

To make that work well, though, I've had to remove the mission name from the graph files (which appear to date back to the fort, and are the only images with that namespacing, and seem to work fine without) and make it possible to build a single mission's graphs.

It doesn't seem like Docker's COPY operations properly maintain file timestamps, so the images are always regenerated when building the container (no change from before) but, importantly, don't get regenerated when we call collectstatic.

To your wider point: I'm already using Docker as the default way to develop (and updated the readme to suggest that's the default, too), but am of the opinion that the Dockerfile should describe the environment setup, and the Makefile should describe what we do inside that environment. That way, both stay relatively focussed and easy(ish) to follow. Were it not for the significant slowdown caused by reindexing and rebuilding stats graphs, I probably wouldn't do the splitting-up of make all that I'm doing in this PR, but I think this is a case where it makes sense to make use of the Docker cache.

(I've pushed a fixup to update the commit you already reviewed, so will update the original message to include more of this detail if you agree this all makes sense.)

Let me know what you think!

jaylett · 2024-03-31T17:42:14Z

Yes, that looks better to me than what I was suggesting. Does there need to be a note somewhere about what will Just Work and what Requires Commands when developing? Since some of it will be baked into the container, I assume that sometimes you need to stop and rebuild, depending on what you're editing?

Previously, we were reindexing on every container rebuild, regardless of whether the mission data or backend code had changed.

Given these are the only things that actually impact indexing, though, we can copy just those to the container, reindex, then copy everything else, meaning that in the majority of cases, container builds are much faster because they're not spending 3-4 minutes reindexing.

To make this work, I've made the Makefile more accurately describe the dependency relationship between stats graphs (at least, the 0th graph, which every mission will have) and transcript files. That way, the later call to `make collectstatic` doesn't try to re-run the previously always-run `make statsporn`.

Because of a limitation in GNU Make's patterns (pattern targets can only have one `%`), we have to remove the mission ID from the graph files. This mission ID doesn't appear to serve any purpose (it is only used on graph images), and dates back to the fort.

There's a slightly weird behaviour in Docker's `COPY` that means it doesn't appear to properly maintain file timestamps when building containers, so the images are always regenerated when building the container (no change from before), but CI builds always start clean, so that's only a slight issue locally (where we might have already built the images).
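To illustrate the single-`%` limitation, here is a simplified Python model of pattern matching, not GNU Make's actual implementation. It assumes only the first `%` acts as a wildcard, that the stem must be non-empty, and that any further `%` is treated as a literal character. The paths are hypothetical and not necessarily the repository's real layout.

```python
def match_stem(pattern, target):
    """Return the stem if target matches a one-'%' pattern, else None."""
    # Only the first '%' is a wildcard; the rest of the pattern is literal.
    prefix, _, suffix = pattern.partition("%")
    if (
        len(target) > len(prefix) + len(suffix)
        and target.startswith(prefix)
        and target.endswith(suffix)
    ):
        return target[len(prefix):len(target) - len(suffix)]
    return None


# Hypothetical paths: the mission ID appears twice in the namespaced name.
namespaced_pattern = "missions/%/images/stats/graph_%_0.png"
namespaced_target = "missions/a13/images/stats/graph_a13_0.png"

# With the mission ID dropped from the file name, one '%' is enough.
plain_pattern = "missions/%/images/stats/graph_0.png"
plain_target = "missions/a13/images/stats/graph_0.png"

print(match_stem(namespaced_pattern, namespaced_target))
print(match_stem(plain_pattern, plain_target))
```

Under these assumptions the namespaced file name never matches, because the second `%` can't stand in for the mission ID, whereas the plain name yields the mission as the stem.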

This allows us to pass a list of mission names for generation. It could probably be smarter, de-duping and so on, but we don't really need that.
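If de-duping ever did become worthwhile, a minimal sketch might look like the following. The function name and mission names are hypothetical, and this is not code from the commit.

```python
def unique_missions(names):
    """Drop repeated mission names while keeping first-seen order."""
    # dict preserves insertion order, so the first occurrence of each name wins.
    return list(dict.fromkeys(names))


requested = ["a13", "mr3", "a13"]
print(unique_missions(requested))
```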

I missed a few things when updating the README to migrate us to a Docker-first world. Specifically, the guidance on reindexing mission content wasn't clear enough on what to do. This also fixes a couple of typos, and updates the deployment guidance now we have continuous deployment to Fly set up.

SteveMarshall · 2024-04-01T09:19:30Z

Most of that is actually documented already, but I've noticed a few inconsistencies in the README, so have gone through with a fine-toothed comb to update the guidance on reindexing.

I've also rebased to integrate the fixup, and updated the commit message for the initial commit.

SteveMarshall · 2024-04-01T10:16:51Z

I've just noticed, however, that the current build process (even prior to these changes) results in non-existent graphs in the dev environment unless you've already built them another way (or run the build process in-container with the external filesystem mounted). I'll work out a fix for that separately (as with getting search working in dev).

jaylett marked this pull request as ready for review March 28, 2024 19:00

jaylett marked this pull request as draft March 28, 2024 19:06

Base automatically changed from upgrade-to-python-3 to main March 30, 2024 13:55

SteveMarshall marked this pull request as ready for review March 30, 2024 18:35

SteveMarshall marked this pull request as draft March 30, 2024 18:36

SteveMarshall force-pushed the optimise-docker-build branch from a5c993e to 04c61f2 Compare March 30, 2024 18:40

SteveMarshall force-pushed the optimise-docker-build branch from 657eeae to 698273c Compare April 1, 2024 09:09

SteveMarshall added 3 commits April 1, 2024 10:10

Enable single mission stats graph generation

4ac2bba

This allows us to pass a list of mission names for generation. It could probably be smarter, de-duping and so on, but we don't really need that.

SteveMarshall force-pushed the optimise-docker-build branch from 698273c to 70806e7 Compare April 1, 2024 09:19

SteveMarshall marked this pull request as ready for review April 1, 2024 09:19

SteveMarshall requested a review from jaylett April 18, 2024 11:57
