Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC (Graphcache): Change SerializedEntries to provide Map to storage adapters #3425

Open
Mookiies opened this issue Nov 9, 2023 · 3 comments
Labels
future 🔮 An enhancement or feature proposal that will be addressed after the next release

Comments

@Mookiies
Copy link
Contributor

Mookiies commented Nov 9, 2023

Summary

Change SerializedEntries to return a Map instead of an object.

This will prevent react-native clients using Hermes from running up against 196607 properties limit and breaking caching. facebook/hermes#851

Proposed Solution

Change the type of SerializedEntries to SerializedEntries = Map<string, string | undefined>;

Downsides

  • SerializedEntries can no longer be JSON.strinigify'ed without using the replacement function
  • Breaking change for all storage adapters (including both shipped as a part of urql)
  • No possibility of ES5 support
@Mookiies Mookiies added the future 🔮 An enhancement or feature proposal that will be addressed after the next release label Nov 9, 2023
@Mookiies
Copy link
Contributor Author

Mookiies commented Nov 9, 2023

Running into this properties limit is something that is happening to us in production. If this a change y'all would be interested interested I'd be happy to own the implementation as I'll likely be making this patch regardless.

Mainly would like to know if I should bother going through the effort of updating storage adapters urql ships and making a merge-able PR.

Mookiies added a commit to Mookiies/urql that referenced this issue Nov 13, 2023
Swap SerializedEntries to pass data as a Map instead plain object.
The main motivation for this change is to support metro's low object
properties limit. facebook/hermes#851

Urql issue: urql-graphql#3425
Mookiies added a commit to Mookiies/urql that referenced this issue Nov 13, 2023
Swap SerializedEntries to pass data as a Map instead plain object.
The main motivation for this change is to support metro's low object
properties limit. facebook/hermes#851

Urql issue: urql-graphql#3425
@kitten
Copy link
Member

kitten commented Nov 15, 2023

I think the problems here we'll run into are that the format isn't detached from Graphcache of course:

export const persistData = () => {
if (currentData!.storage) {
currentOptimistic = true;
currentOperation = 'read';
const entries: SerializedEntries = {};
for (const key of currentData!.persist.keys()) {
const { entityKey, fieldKey } = deserializeKeyInfo(key);
let x: void | Link | EntityField;
if ((x = readLink(entityKey, fieldKey)) !== undefined) {
entries[key] = `:${stringifyVariables(x)}`;
} else if ((x = readRecord(entityKey, fieldKey)) !== undefined) {
entries[key] = stringifyVariables(x);
} else {
entries[key] = undefined;
}
}
currentOptimistic = false;
currentData!.storage.writeData(entries);
currentData!.persist.clear();
}
};
export const hydrateData = (
data: InMemoryData,
storage: StorageAdapter,
entries: SerializedEntries
) => {
initDataState('write', data, null);
for (const key in entries) {
const value = entries[key];
if (value !== undefined) {
const { entityKey, fieldKey } = deserializeKeyInfo(key);
if (value[0] === ':') {
if (readLink(entityKey, fieldKey) === undefined)
writeLink(entityKey, fieldKey, JSON.parse(value.slice(1)));
} else {
if (readRecord(entityKey, fieldKey) === undefined)
writeRecord(entityKey, fieldKey, JSON.parse(value));
}
}
}
data.storage = storage;
data.hydrating = false;
clearDataState();
};

For context (i.e. readers who'll be following this thread passively), this means that we'll have to make all storages incompatible and this will break any offline caches that someone may have already built-up. In short, this is definitely a big change, and a breaking change, if we consider storages to be a public interface; personally, I do, but this also depends on how many custom storages we expect people to have written, re.

export interface SerializedEntries {
[key: string]: string | undefined;
}

Currently, the problem of storage persistence runs rather deep and I'd consider the current implementation optimised only for the format it has.
In short, it:

  • buffers keys that have changed
  • splits the keys back to their entity + field representation
  • reads the value from the above entry (which is necessary to get the "ground truth" i.e. read all layers)
  • marshalls the value and writes it to a buffer
  • sends the buffer off to storage.writeData

The reason this is done this way (besides reading from the layers in the "ground truth" order), is to not make writing more expensive than it is. We don't want the write performance to be impacted, which is why a buffer is created rather than "streaming" the changed values into a buffer/storage.
This also has the benefit of the process being deferrable.
I'd hate to lose this and while it's tempting to make this an iterable, it's probably not going to be performant, and we don't have benchmarks for this yet. However, any iterable-like could avoid key splitting and hence avoid an intermediary format.

The alternative of maps is interesting, but indeed causes serialisation issues. Hence, it doesn't actually solve the problem but pushes it onwards to storages. This increases complexity for storages.

So, the format per se isn't necessarily the problem:

  • The problem is that hydrateData is an atomic operation that also doesn't operate on chunked iterables, which also would have benefits for persistData
  • We're currently only at risk of hitting this limit if the persist "chunk" is too big, or because hydrateData isn't chunked

So, this does open up more options and more aggressive changes than just switching to Maps. Basically, if we can make the entire thing chunked by default and evaluate how to reduce some of the key and marshalling complexity then we may even get an even better interface here

@Mookiies
Copy link
Contributor Author

Firstly, thank you for the detailed response and all the work on this library ❤️

I did get Graphcahce working with Maps, Mookiies@fcb9d11. However, I agree that simply switching to Maps wouldn't be a good API change for the library. It's a huge breaking change and there were plenty of footguns in Map serialization for the StorageAdapter. I didn't do any performance analysis. While it does solve my immediate issue of Hermes's low object properties limit, that's really a Hermes issue not one with Urql.

Coming up with an interface that would "chunk" by default would be awesome, especially if it can provide Graphcahce with performance benefits without significantly increasing complexity on StorageAdapters. Have these issues around hydrateData and persistData only operating on large chunks ever come up in a different context as-well?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
future 🔮 An enhancement or feature proposal that will be addressed after the next release
Projects
None yet
Development

No branches or pull requests

2 participants