
QueryCache: Shared query cache for all requests going through runRequest #86957

Closed

Conversation

torkelo (Member) commented Apr 26, 2024

It would be nice if Grafana did not re-query the same data so much. Scenes partly fixes this, but even there it can be a bit cumbersome as it relies on caching the full scene tree.

But caching queries is a pretty tricky problem (and could come with perf costs as well).

Cache key parts (a rough sketch of assembling the key follows this list)

  • Queries JSON model, JSON stringified
    • Sadly, we would also have to interpolate the full query JSON string for variables (which can be costly and would happen again in the data source implementation)
  • Time range
  • Data source uid
  • Adhoc filters + group by
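A minimal sketch of assembling such a key, assuming hypothetical names (buildCacheKey is not in the PR; DataQueryRequest is the real @grafana/data type, but the exact fields used here are illustrative):

```ts
import { DataQueryRequest } from '@grafana/data';

// Hypothetical helper: derives a cache key from the parts listed above.
// `interpolatedQueriesJson` is the variable-interpolated query model, the
// costly step called out above, since interpolation happens again inside
// the data source implementation.
function buildCacheKey(request: DataQueryRequest, interpolatedQueriesJson: string): string {
  return [
    interpolatedQueriesJson,
    String(request.range.raw.from),
    String(request.range.raw.to),
    request.targets[0]?.datasource?.uid ?? '',
    JSON.stringify(request.filters ?? []), // adhoc filters (group by would join here too)
  ].join('|');
}
```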

Problems / challenges

  • For absolute time ranges, a user can still hit the refresh button to issue a new query, but nothing in the above cache key would change (a possible workaround is sketched after this list)
  • Don't cache partial or streaming responses
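One way the refresh case could be handled, as a hedged sketch (skipQueryCache is a hypothetical flag that a refresh handler would set; it is not an existing DataQueryRequest field, and the cache map mirrors the PR's cache.set call):

```ts
import { DataQueryRequest, PanelData } from '@grafana/data';

// Assumed module-level shared cache, mirroring the PR's cache.set call.
const cache = new Map<string, PanelData>();

// Sketch: a manual refresh bypasses and invalidates the cached entry,
// even though the cache key itself is unchanged.
function lookup(
  request: DataQueryRequest & { skipQueryCache?: boolean },
  cacheKey: string
): PanelData | undefined {
  if (request.skipQueryCache) {
    cache.delete(cacheKey); // force a fresh query on explicit refresh
    return undefined;
  }
  return cache.get(cacheKey);
}
```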

Many requests are not going through runRequest

While all metric queries and most variable queries go through runRequest, data source meta & series lookup queries do not. Prometheus / Loki issue many of these, sometimes one per query. These data sources have their own meta cache to limit requesting the same data, but many requests are still duplicated.

Anyway, this is just a 10 min test. Let me know if anyone is interested in this, thinks it's worth pursuing further, or has ideas for how to approach it differently.
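For reference, the diff below is the write side of the cache; a minimal sketch of what the corresponding read side at the top of runRequest could look like (the cache object and this helper are assumptions, only the cache.set line is actually in the PR):

```ts
import { Observable, of } from 'rxjs';
import { PanelData } from '@grafana/data';

// Assumed module-level shared cache; the PR diff only shows the write side.
const cache = new Map<string, PanelData>();

// Sketch: short-circuit runRequest with a completed observable when the
// key is already cached, so no real query is issued.
function tryCache(cacheKey: string): Observable<PanelData> | undefined {
  const cached = cache.get(cacheKey);
  return cached ? of(cached) : undefined;
}
```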

@@ -159,6 +169,7 @@ export function runRequest(
       request.endTime = Date.now();
 
       state = processResponsePacket(packet, state);
+      cache.set(cacheKey, state.panelData);
Contributor commented:
Cache is set after getting the result, so if there are multiple requests initiated before the cache is set, they will all be executed (e.g. on initial render). Another way to approach it could be shortly debouncing runRequest per hash; a sketch of de-duplicating in-flight requests follows. Either way, as you said, it may come with some perf cost, so maybe it'd make sense if it was optional and enabled on a case-by-case basis.
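A minimal sketch of de-duplicating in-flight requests per cache key with RxJS (the in-flight map and helper name are assumptions; shareReplay and finalize are standard RxJS operators):

```ts
import { Observable, finalize, shareReplay } from 'rxjs';
import { PanelData } from '@grafana/data';

// Assumed module-level map of in-flight requests, keyed by the cache key.
const inFlight = new Map<string, Observable<PanelData>>();

// Sketch: concurrent callers with the same key share one underlying query
// instead of each issuing their own.
function dedupe(cacheKey: string, run: () => Observable<PanelData>): Observable<PanelData> {
  let shared = inFlight.get(cacheKey);
  if (!shared) {
    shared = run().pipe(
      finalize(() => inFlight.delete(cacheKey)), // drop the entry when the source settles
      shareReplay({ bufferSize: 1, refCount: false })
    );
    inFlight.set(cacheKey, shared);
  }
  return shared;
}
```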

To cache interpolated queries, maybe the cache could live closer to the .query() method. There's already queryCachingTTL for backend caching, but it's only used in Enterprise.

My gut feeling is that it could be a nice thing to have in general if it was opt-in rather than default 🤔
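As a sketch of the opt-in idea, caching could be gated on a feature toggle via the standard @grafana/runtime config (queryCache here is a hypothetical toggle name, not an existing toggle):

```ts
import { config } from '@grafana/runtime';

// Sketch: only use the shared cache when a (hypothetical) `queryCache`
// feature toggle is enabled, keeping the behavior opt-in.
const queryCacheEnabled = Boolean(
  (config.featureToggles as Record<string, boolean | undefined>)['queryCache']
);
```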

Member commented:

I'd advise some caution when thinking of moving it closer to the query method unless there is a way to opt-out of caching – some teams (including ours, the Alerting team) have already implemented RTK Query in parts of the code base but still call the query() function in queryFn. A double layer of cache would be ... well you know what they say about cache invalidation :D

torkelo (Author) commented:

@ifrost yeah, we could implement a query cache concept similar to react-query to handle that (so multiple queries with the same key do not cause multiple real queries).

I originally tried to use the react-query cache, but it does not look possible.

The point of a cache here is to have a shared cache across different apps / parts of Grafana.

@@ -159,6 +169,7 @@ export function runRequest(
       request.endTime = Date.now();
 
       state = processResponsePacket(packet, state);
+      cache.set(cacheKey, state.panelData);
Contributor commented:

I think if we go with that, we should only cache successful data, i.e. no error responses.
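A hedged sketch of that guard, which would also cover the earlier point about not caching partial or streaming responses (LoadingState and PanelData are real @grafana/data types; maybeCache and the cache map are assumptions):

```ts
import { LoadingState, PanelData } from '@grafana/data';

// Assumed shared cache, as in the diff above.
const cache = new Map<string, PanelData>();

// Sketch: only cache complete, successful responses; skip error results
// and partial/streaming packets.
function maybeCache(cacheKey: string, data: PanelData): void {
  if (data.state === LoadingState.Done && !data.error) {
    cache.set(cacheKey, data);
  }
}
```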

torkelo closed this May 8, 2024
grafana-delivery-bot removed this from the 11.1.x milestone May 8, 2024