[Improvement]: Do not clear all output-response-caches once any object has been saved. #785

Open
jhormel opened this issue Aug 10, 2023 · 3 comments


jhormel commented Aug 10, 2023

Improvement description

The following suggestion uses the Data Hub config to enable output-response-cache:

pimcore_data_hub:
    graphql:
        output_cache_enabled: true
        output_cache_lifetime: 172800

Current state and behavior:

Short:
As the title suggests, Data Hub currently invalidates all response caches (strictly speaking, it ignores and regenerates them) as soon as any object is saved in the backend.


Detailed:

  1. Let's say we have two completely separate DataObjects, News and User.
  2. Via Data Hub / GraphQL we can query them with getNewsListing or getUserListing (this requires the corresponding configuration, of course).
  3. After the first query, on either News or User, the response has been cached and any repeated request is served from the cache (much better response time).
  4. If we now go to the backend and click "save" on a User, all response caches for News become invalid as well, and the response cache is ignored for every query.

I know that at some point we have to invalidate the cache, but with this behavior the cache very quickly becomes useless once mutations via Data Hub / GraphQL are involved.
A mutation has basically the same effect as clicking save in the backend: the entire response cache becomes outdated.


Technical details:
We found that the cache invalidation happens due to the version-tag mechanism of a stored response cache. The stored version tags consist of 4 parts: output, datahub, api (endpoint name), and a hash that represents the query/payload in some way. I am not 100% sure about the last part yet, but it is not the relevant piece anyway.
Example of response-cache version tags:

[tags75f07179f7be914028e416b7172702a2] => Array (
    [output] => 445607507436644576
    [datahub] => -8982459239080682620
    [api] => -8982459239080682620
    [75f07179f7be914028e416b7172702a2] => -1294679556124021229
)

Further, the database keeps the latest integer in a single entry/field to know whether the version tags are outdated. Decoded, it looks like this (let's call it the cache-identifier):

item_id => WgpniAsIXb:output tags
item_data => i:445607507436644576;

Note that the value of output is 445607507436644576 in both places => valid cache!

What happens now: once we click save on any DataObject or run a mutation, the cache-identifier is deleted and ALL response caches must be generated again.
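
The mechanism described above can be modelled in a few lines. This is a minimal Python sketch of tag-versioned invalidation, not Data Hub's or Symfony's actual code: the class `TagVersionedCache` and its method names are hypothetical, and versions are simple counters rather than the large integers shown in the dumps.

```python
import itertools


class TagVersionedCache:
    """Toy model of a cache whose entries are validated against tag versions."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._tag_versions = {}  # tag -> current version (the "cache-identifier")
        self._items = {}         # key -> (value, {tag: version at write time})

    def _current_version(self, tag):
        # A missing tag gets a fresh version, as after the identifier was deleted.
        if tag not in self._tag_versions:
            self._tag_versions[tag] = next(self._counter)
        return self._tag_versions[tag]

    def save(self, key, value, tags):
        snapshot = {tag: self._current_version(tag) for tag in tags}
        self._items[key] = (value, snapshot)

    def get(self, key):
        if key not in self._items:
            return None
        value, snapshot = self._items[key]
        # The entry is valid only if every stored tag version is still current.
        for tag, version in snapshot.items():
            if self._current_version(tag) != version:
                return None
        return value

    def invalidate_tag(self, tag):
        # Deleting the stored version forces a new one on the next access.
        self._tag_versions.pop(tag, None)


cache = TagVersionedCache()
cache.save("news_query", "news response", ["output", "datahub", "api"])
cache.save("user_query", "user response", ["output", "datahub", "api"])

cache.invalidate_tag("output")  # models saving any element in the backend

news_after_save = cache.get("news_query")
user_after_save = cache.get("user_query")
print(news_after_save, user_after_save)
```

Because both responses carry the shared `output` tag, invalidating that one tag makes the News response stale even though only a User was saved.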


So this could be either a bug or an improvement. Since I don't know what the original specification of this feature was, I'd ask someone who knows to label this issue correctly.

Last but not least: is there any way to fix this or work around it?
In our scenario we have a lot of mutations coming in from the frontend, which, as said above, basically makes the cache useless since it barely gets used.

Thanks in advance!


Tested and verified in:
pimcore/data-hub 1.6.2
pimcore/data-hub 1.0.11

jhormel (Author) commented Sep 21, 2023

Ping :)

Has anybody read this?

First of all, it would be interesting to know whether this is really an "improvement" or actually a "bug".


Thanks a lot for reporting the issue. We did not consider the issue as "Pimcore:Priority", "Pimcore:ToDo" or "Pimcore:Backlog", so we're not going to work on that anytime soon. Please create a pull request to fix the issue if this is a bug report. We'll then review it as quickly as possible. If you're interested in contributing a feature, please contact us first here before creating a pull request. We'll then decide whether we'd accept it or not. Thanks for your understanding.

fashxp (Member) commented Feb 20, 2024

I would label it more as a possible improvement.
The thing is that the responses are tagged with output, which is cleared on every element save in Pimcore (mainly used for Full Page Cache).

The reason is that the cache needs to be invalidated as soon as data gets updated. I guess tagging every response with all related element tags (e.g. object_123) becomes quite complex, since the response might also include related data (data from relations to other data objects, assets, etc.).
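
To illustrate why per-element tagging gets complex, here is a rough Python sketch that collects the tags a single response would need. It assumes a hypothetical relation graph given as a plain dictionary; the helper `collect_element_tags` and all IDs other than object_123 are made up for illustration and are not part of Pimcore.

```python
def collect_element_tags(root_id, relations):
    """Return the tags of root_id and of everything reachable from it."""
    tags = set()
    stack = [root_id]
    while stack:
        element_id = stack.pop()
        if element_id in tags:
            continue  # already visited, also guards against cyclic relations
        tags.add(element_id)
        stack.extend(relations.get(element_id, []))
    return tags


# Hypothetical relation graph: a News object references an author and an image,
# and the author in turn references an avatar asset.
relations = {
    "object_123": ["object_7", "asset_3"],
    "object_7": ["asset_9"],
}

news_tags = collect_element_tags("object_123", relations)
print(sorted(news_tags))
```

Even in this small example the response for one object depends on four elements, and a listing query would need the union of such sets over every returned row.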

One possible approach is to avoid clearing the cache when a cache lifetime is set. This would then also match the behavior of Full Page Cache.
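
A minimal Python sketch of that lifetime-based idea, as one possible reading of the suggestion: entries saved with a lifetime survive an element save and only expire by time. The class `LifetimeCache`, its `on_element_saved` hook, and the injected clock are assumptions for illustration, not existing Pimcore or Data Hub APIs.

```python
class LifetimeCache:
    """Toy cache: entries with a lifetime ignore element saves and expire by time."""

    def __init__(self, clock):
        self._clock = clock  # callable returning the current time in seconds
        self._items = {}     # key -> (value, expires_at or None)

    def save(self, key, value, lifetime=None):
        expires_at = None if lifetime is None else self._clock() + lifetime
        self._items[key] = (value, expires_at)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._items[key]
            return None
        return value

    def on_element_saved(self):
        # Only entries without a lifetime are dropped when an element is saved.
        self._items = {k: v for k, v in self._items.items() if v[1] is not None}


now = [0]  # fake clock so the example does not depend on real time
cache = LifetimeCache(clock=lambda: now[0])
cache.save("news_query", "news response", lifetime=172800)
cache.save("no_ttl_query", "other response")

cache.on_element_saved()
after_save = cache.get("news_query")           # still served from cache
no_ttl_after_save = cache.get("no_ttl_query")  # dropped on save

now[0] = 172800
after_expiry = cache.get("news_query")         # expired by lifetime
```

The trade-off is that clients may receive stale data for up to the configured lifetime, so this only fits setups where that delay is acceptable.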
