Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move workspace and grader hosts into apps directory #7572

Merged
merged 30 commits into from Apr 27, 2023

Conversation

nwalters512
Copy link
Contributor

@nwalters512 nwalters512 commented Apr 25, 2023

This PR introduces an apps/ directory. It will function similarly to the existing packages/ directory in the sense that its contents will be treated as Yarn workspaces and will generally have a build script that will be run automatically; it will just contain "apps" (distinct runnable services) instead of sharable packages.

The workspace_host and grader_host directories have been renamed to apps/workspace-host and apps/grader-host, respectively. I intend to do the same with the PrairieLearn source code too in a separate PR.

This work will unlock our ability to use TypeScript and other built assets in apps.

Dockerfile Outdated
Comment on lines 8 to 11
# Note that we also have to copy the `apps` and `packages` directories so that
# `yarn` can resolve the workspaces inside it and set up symlinks correctly.
# This is suboptimal, as a change to any file in these directories will
# invalidate this layer's cache, but it's unavoidable.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to really suck once all of PrairieLearn is inside apps. The best solution I have for this is to individually COPY every app or package package.json file, probably with some script to enforce that we didn't forget any? I don't love this, but I also really want to take advantage of layer caching :/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also can't think of any other way around it.

However, I just want to make sure that we really should be so worried about layer caching. What's the use case here? People doing local dev? Afaik, our executor image won't benefit from layer caching because we never pull updated versions?

Assuming we COPY packages/package-1/package.json, etc, then apps/app-1/package.json, I can see three common cases:

  1. All package.json files are unmodified: big win from caching.
  2. The PL package.json has a change, but all of the packages/ have not changed. Some win from caching, but I'm not sure how much of a win this actually is? PL presumably has by far the largest node_modules?
  3. One of the first COPYed packages/ has a change in package.json: almost all caches are blown away.

So I guess I'm wondering just much win we get in case (2), which I think is quite common, and how often we get case (1)? Unless people are pulling new images very frequently, I think they are likely to normally hit case (2). Certainly if they pull once a month they are guaranteed to hit (2) because of dependabot.

Am I thinking about this correctly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main thing I'm thinking about is CI times. For a typical master build that changes code but not dependencies, copying the entire apps/ directory in one go means that we'll incur the cost of downloading + building all deps and the cost of pushing new layers to Docker Hub. That may be several minutes, which isn't a ton, but I'm already struggling to keep our time from merge to deploy low, and this just makes that worse.

Docker appears to be completely unwilling to allow for reasonable glob behavior for COPY: moby/moby#15858.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took the individual copy approach in f402ab6, which seems to work as expected. I added tools/validate-dockerfile.mjs to make sure we don't forget to add any packages.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, it inexplicably takes 2.5 minutes to do all the COPY steps on GitHub Actions?? It's essentially instant for me locally...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying the "copy all of apps/ and packages/" approach in 784ec9e, we'll see how that fares.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's marginally faster in the case of a cache miss, at the cost of incurring cache misses more often.

I think we just need to hope that Docker implements COPY --parents soon (moby/moby#35639) 😢

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I agree this is all we can do for now. And thanks for reminding me about CI being the main use case for this! I always manage to forget about CI 😅

Makefile Outdated Show resolved Hide resolved
Makefile Show resolved Hide resolved
package.json Show resolved Hide resolved
.github/dependabot.yml Show resolved Hide resolved
Comment on lines 40 to 42
# Ideally we'd use `with.cache: 'yarn'` in the `setup-node` action, but that
# doesn't support restoring from fallback cache keys in the case of an exact
# cache hit: https://github.com/actions/setup-node/issues/328.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actions/setup-node#328 - there are a few WIP PRs that address this limitation, so hopefully we can switch back to using actions/setup-node for caching soon!

@@ -97,7 +97,7 @@ pl_tmux_start_workspace_pane () {
echo "Starting workspace host..."
# A background process spams this pane, so pipe it to cat
# to prevent a shell prompt from appearing also:
make start-workspace-host | cat
make dev-workspace-host | cat
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@echuber2 what're the desired semantics of this script? That is, should it start PL+workspaces in a way that's suitable for local development of PrairieLearn itself, or for a course? I'm treating it as the former, hence the usage of dev scripts here, but I'd like your opionion.

If this script is no longer useful to you, I'd be happy to resolve this question by just deleting it 😅

@nwalters512 nwalters512 marked this pull request as ready for review April 26, 2023 23:42
@nwalters512 nwalters512 merged commit 0b64023 into master Apr 27, 2023
2 checks passed
@nwalters512 nwalters512 deleted the workspace-and-grader-apps branch April 27, 2023 17:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants