Cache PUB internal calls #213

Open
iszulcdeepsense opened this issue Mar 17, 2023 · 2 comments
Labels: good first issue (Good for newcomers), performance (Performance improvement)

Comments

@iszulcdeepsense
Collaborator

On each Job call, the PUB component makes a request to Lifecycle to check whether the user is allowed to call the job and to obtain the job's internal address to forward the request to. Lifecycle, in turn, queries the database, which can perform badly under higher load.

To improve PUB's performance, these internal calls could be cached with a short time-to-live (e.g. 1 minute).
The composite cache key would consist of: user auth token, job name, job version, job endpoint.

The result could be cached either by PUB or by Lifecycle, though it makes more sense for PUB to do it.
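
A minimal sketch of how such a cache in PUB could look (an assumption, not an existing Racetrack API: the PUB→Lifecycle call is passed in as a plain callable with a hypothetical signature, and the TTL is the 1 minute suggested above):

```python
import time
from typing import Callable, Dict, Tuple

CACHE_TTL_SECONDS = 60  # short time-to-live, as suggested above

# composite key: (user auth token, job name, job version, job endpoint)
CacheKey = Tuple[str, str, str, str]
_cache: Dict[CacheKey, Tuple[float, str]] = {}  # key -> (timestamp, internal address)


def resolve_job_address(
    auth_token: str,
    job_name: str,
    job_version: str,
    endpoint: str,
    lifecycle_lookup: Callable[[str, str, str, str], str],
) -> str:
    """Return the job's internal address, asking Lifecycle only on a cache miss.

    `lifecycle_lookup` stands in for the existing PUB->Lifecycle call that both
    authorizes the caller and resolves the address (hypothetical signature).
    """
    key: CacheKey = (auth_token, job_name, job_version, endpoint)
    cached = _cache.get(key)
    now = time.monotonic()
    if cached is not None and now - cached[0] < CACHE_TTL_SECONDS:
        return cached[1]  # fresh entry: skip the Lifecycle/database round trip
    address = lifecycle_lookup(auth_token, job_name, job_version, endpoint)
    _cache[key] = (now, address)
    return address
```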

@iszulcdeepsense iszulcdeepsense added the performance Performance improvement label Mar 17, 2023
@JosefAssadERST
Member

which can perform badly under higher load

Has this actually been the case?

I'm not against caching at all, but I typically prefer to try as many other things as possible first. Off the top of my head:

  • Consolidating DB calls (e.g. if we always make the same two calls in succession, group them in one)
  • Checking if the connection pooler is doing the right thing keeping the connection "warm" so we reduce connection overhead
  • Measuring how much of the performance budget on a DB connection is in fact unavoidable infrastructure latency (VPN, network latency, other encryption such as TLS)

Once we start caching, we will pay a price later in more difficult troubleshooting, and it becomes easier to make mistakes that cause bugs or security issues (e.g. anything touching user accounts will also need to remember to invalidate the cache).
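
To illustrate that maintenance burden (a sketch only, reusing the hypothetical cache keyed on (auth token, job name, job version, endpoint) from the snippet above): every code path that revokes a token or changes a user's permissions would need to remember to drop the matching entries.

```python
def invalidate_user(cache: dict, auth_token: str) -> None:
    """Drop every cached entry belonging to the given auth token.

    Would have to be called from any code path that revokes the token or
    changes the user's permissions - easy to forget.
    """
    for key in [k for k in cache if k[0] == auth_token]:
        del cache[key]
```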

@iszulcdeepsense
Collaborator Author

Has this actually been the case?

To be honest, it's just my conjecture. It has been shown that a single request to Racetrack v2 can be slower than in v1 due to the external database.
This idea is just a trick to mitigate that, but I don't think it's a problem right now.

For now, let's leave it as it is. This issue could even be classified as a "premature optimization". Plus, I agree that this could make troubleshooting and testing more difficult.

@iszulcdeepsense iszulcdeepsense added the good first issue Good for newcomers label Dec 4, 2023