
Consider adding Response caching for ARAX and/or KG2 #2130

Open
edeutsch opened this issue Sep 12, 2023 · 2 comments

Comments

@edeutsch
Collaborator

Consider adding Response caching.

  1. The biggest bang for the buck would be for ARAX. Less so for KG2, but KG2 could also benefit some.
  2. There is already something called "ResponseCache", although it's more of a "response archive" than a "response cache".
  3. Since our code changes a lot, it seems prudent to clear the cache with every service restart, although something more clever could be contemplated.
  4. Cache hits could be detected by taking the submitted query, removing the "callback" and "remote_address" fields (which are transient), serializing the remainder in a repeatable way, and then hashing it (see the sketch after this list).
  5. For ARAX, a cached Response could just point to the "response archive" and pull the Response from there, perhaps editing the "callback" and "remote_address" before sending it back.
  6. For KG2, responses are not currently archived, so we couldn't use that. We would likely have to cache the Responses in a local store that could be similar to the "response archive" but transient.
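
A minimal sketch of the cache-key idea in point 4, assuming Python; the field names and helper function are illustrative, not existing ARAX code:

```python
import hashlib
import json

# Fields assumed to be transient per point 4; adjust to the actual query schema.
TRANSIENT_FIELDS = {"callback", "remote_address"}

def compute_cache_key(query: dict) -> str:
    """Hash a submitted query repeatably, ignoring transient fields."""
    stable = {k: v for k, v in query.items() if k not in TRANSIENT_FIELDS}
    # sort_keys plus fixed separators give a repeatable serialization
    # regardless of the order keys arrived in.
    canonical = json.dumps(stable, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```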
@saramsey
Member

Interesting idea. So I am wondering, how frequent are such repeated queries of the same TRAPI graph, that are not "testing if ARAX is working"? If it isn't a high proportion of our non-testing-related queries, then that might limit the positive benefit. On the other hand, it may not be that hard to implement. Are you thinking of an in-memory cache? That could pose very interesting/thorny threading issues. Or perhaps you were thinking of a SQLite cache on the local EBS volume; that would presumably be much faster than retrieving from the S3 bucket. And the cache would not be cross-service-instance, right? i.e., it would just be local to the specific ARAX service?

[For the "testing if ARAX is working" use-case, I presume that achieving sub-second response times is not really a high priority; that's why I phrased my question in terms of use-cases outside of testing-for-ARAX-not-being-broken].

@edeutsch
Collaborator Author

> Interesting idea. So I am wondering, how frequent are such repeated queries of the same TRAPI graph, that are not "testing if ARAX is working"?

I don't have enough data to know.

> If it isn't a high proportion of our non-testing-related queries, then that might limit the positive benefit.

True.

> On the other hand, it may not be that hard to implement. Are you thinking of an in-memory cache? That could pose very interesting/thorny threading issues.

No; with all the forking involved, I don't think an in-memory cache would work.

> Or perhaps you were thinking of a SQLite cache on the local EBS volume; that would presumably be much faster than retrieving from the S3 bucket. And the cache would not be cross-service-instance, right? i.e., it would just be local to the specific ARAX service?

Exactly, yes to all.
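
For illustration, a per-instance SQLite cache on the local volume might look something like the sketch below; the database path, table name, and functions are assumptions, not existing ARAX code:

```python
import sqlite3

CACHE_DB = "/path/to/local/arax_response_cache.sqlite"  # hypothetical local-EBS path

def _connect() -> sqlite3.Connection:
    """Open the cache database, creating the table if it doesn't exist yet."""
    conn = sqlite3.connect(CACHE_DB)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS response_cache ("
        "cache_key TEXT PRIMARY KEY, response_json TEXT)"
    )
    return conn

def get_cached_response(cache_key: str):
    """Return the cached response JSON string, or None on a cache miss."""
    conn = _connect()
    try:
        row = conn.execute(
            "SELECT response_json FROM response_cache WHERE cache_key = ?",
            (cache_key,),
        ).fetchone()
        return row[0] if row else None
    finally:
        conn.close()

def store_response(cache_key: str, response_json: str) -> None:
    """Insert or overwrite the cached response for this cache key."""
    conn = _connect()
    try:
        conn.execute(
            "INSERT OR REPLACE INTO response_cache (cache_key, response_json) "
            "VALUES (?, ?)",
            (cache_key, response_json),
        )
        conn.commit()
    finally:
        conn.close()
```

Because the cache lives in a single on-disk file, clearing it at service restart (point 3 above) could be as simple as deleting the file, and it naturally stays local to each ARAX instance rather than being shared across instances.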

> [For the "testing if ARAX is working" use-case, I presume that achieving sub-second response times is not really a high priority; that's why I phrased my question in terms of use-cases outside of testing-for-ARAX-not-being-broken].

Yes. Ideally we would ensure that our "bypass_cache" option is working correctly; the watchdog would use "bypass_cache", while other clients would not.
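
For illustration only, a watchdog probe might opt out of caching like this; the placement of "bypass_cache" in the query envelope and the endpoint URL are assumptions, not confirmed ARAX API details:

```python
import requests

# Hypothetical watchdog probe: force a fresh (uncached) run on every check.
watchdog_query = {
    "message": {"query_graph": {"nodes": {}, "edges": {}}},  # trivial test graph
    "bypass_cache": True,  # assumed option name per the discussion above
}
resp = requests.post("https://arax.example.org/api/arax/v1.4/query", json=watchdog_query)
print(resp.status_code)
```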
