New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf(seed): run rebuilds in parallel, add perf logs #4334
base: main
Are you sure you want to change the base?
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
3 Ignored Deployments
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. It'll make CPU's heat up, but definitely worth the experiment.
for (const filename of SEARCH_PARAMETER_BUNDLE_FILES) { | ||
for (const entry of readJson(filename).entry as BundleEntry[]) { | ||
await createParameter(systemRepo, entry.resource as SearchParameter); | ||
promises.push(createParameter(systemRepo, entry.resource as SearchParameter)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lol, curious how Postgres handles this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're capped at 10 concurrent connections with the pool, I considered trying to up it and see what happens, but seemed to run fine on my local machine as is.
8a48b05
to
53c8e88
Compare
53c8e88
to
000a017
Compare
607e74f
to
40a0941
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this investigation.
How much time savings do we get from this?
I'd like to understand full picture. This increases the complexity of test setup (extra instance of postgres, double seed
logic, etc). I just want to be sure that complexity is justified.
Scratch that, I see, 30-40 seconds, ok, that is substantial.
Yeah, let's push forward with this 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why move this file? Strong preference for keeping all .ts
files in src/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was initially due to weird glob matching not excluding the seed tests properly from main line of tests... it was cleaner to just separate them since they are somewhat logically separate...
I put them back though, found a sort of hack around it with some **
paths that works for some reason. Made a note about it in the file
promises.push(createParameter(systemRepo, entry.resource as SearchParameter)); | ||
} | ||
} | ||
await Promise.all(promises); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider always building the array, and then using the parallel
option in the last step
if (finalOptions.parallel) {
await Promise.all(promises);
} else {
for (const promise of promises) {
await promise;
}
}
That way we don't need to worry about the logic getting out of sync.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually this won't work. Filling the array in the serial case doesn't actually execute the promises serially, only waits for them serially. We have to create the promise in the loop where it is awaited in order to not unintentionally parallelize them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if it's worth introducing a new dependency, but p-limit seems like a pretty elegant and simple solution to abstract away the max concurrency that we want to allow and avoid the if/else
code paths.
e3b7b1b
to
8ad3add
Compare
0168a8c
to
8ef8b4b
Compare
Quality Gate passedIssues Measures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Just a couple comments to consider.
@@ -261,7 +261,7 @@ function addDefaults(config: MedplumServerConfig): MedplumServerConfig { | |||
config.accurateCountThreshold = config.accurateCountThreshold ?? 1000000; | |||
config.defaultBotRuntimeVersion = config.defaultBotRuntimeVersion ?? 'awslambda'; | |||
config.defaultProjectFeatures = config.defaultProjectFeatures ?? []; | |||
config.emailProvider = config.emailProvider || (config.smtp ? 'smtp' : 'awsses'); | |||
config.emailProvider = config.emailProvider ?? (config.smtp ? 'smtp' : 'awsses'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would ||
actually be preferable here to guard against config.emailProvider === ''
?
promises.push(createParameter(systemRepo, entry.resource as SearchParameter)); | ||
} | ||
} | ||
await Promise.all(promises); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if it's worth introducing a new dependency, but p-limit seems like a pretty elegant and simple solution to abstract away the max concurrency that we want to allow and avoid the if/else
code paths.
So far, this has shown to reduce average startup time when seeding a fresh database on my local machine by 35-45 seconds. I expect to see a noticeable speedup in CI too.
TODO
- [x] Ensure low load on when rebuilding (test on staging)