Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance by disabling return/resolved values validation #1493

Open
Warxcell opened this issue Dec 7, 2023 · 12 comments
Open

Improve performance by disabling return/resolved values validation #1493

Warxcell opened this issue Dec 7, 2023 · 12 comments

Comments

@Warxcell
Copy link
Contributor

Warxcell commented Dec 7, 2023

What do you think about this one? The use-case is to gain a little bit of performance, by disabling resolver's returned value validation - and just trusting it.

No idea how much will gain - just throwing it in the wild and see what people think. :)

@spawnia
Copy link
Collaborator

spawnia commented Dec 11, 2023

My instinct says that the additional check for whichever setting controls this behaviour might eat up some of the benefits afforded by it. Making this as fast as possible would require having a code path that does no checks at all, but also does not checks if that behaviour is enabled. This would require duplicating/rewriting substantial parts of the executor, as well as possibly swapping the default field resolver and other parts that have sanity/safety checks.

Feel free to explore this in a pull request, along with benchmarks to test the effectiveness. You can also use a custom executor in your own project, see

public static function setImplementationFactory(callable $implementationFactory): void
. I don't think this is going to be a worthwhile endeavours and am thus closing this issue, but I am also open to being convinced otherwise with working code and the benchmarks to prove it.

@spawnia spawnia closed this as not planned Won't fix, can't repro, duplicate, stale Dec 11, 2023
@mfn
Copy link
Contributor

mfn commented Dec 13, 2023

FTR I would be interested in this.

Once payloads returned exceed a significant size (in the multi-mega-byte range), I've seen that when profiling (used xdebug & PhpStorm xdebug snapshot analyzer) the micro-overhead for a single field becomes measurable in the big picture to the point it starts to rival the slowest part in those requests, usually the database.

@spawnia
Copy link
Collaborator

spawnia commented Dec 13, 2023

FTR?

@mfn
Copy link
Contributor

mfn commented Dec 13, 2023

For The Record

@Warxcell
Copy link
Contributor Author

Warxcell commented Dec 15, 2023

btw https://github.com/zalando-incubator/graphql-jit have thing called

disableLeafSerialization {boolean, default: false} - disables leaf node serializers. The serializers validate the content of the field at runtime so this option should only be set to true if there are strong assurances that the values are valid.

I will try to make test in few days to see what's the gain.

@mfn can you point me where most execution time comes from?

@spawnia spawnia reopened this Dec 15, 2023
@spawnia spawnia changed the title Feature request: disable return/resolved values validation Improve performance by disabling return/resolved values validation Dec 15, 2023
@mfn
Copy link
Contributor

mfn commented Dec 15, 2023

I can't share the xdebug cachegrind output but I can leave a screenshot 😅 Hope it's useful:
image

Of course Laravel / Eloquent also appears a LOT there.

But if you look at own time vs. amount of calls, I think even small optimizations can pay off (ofc. also in non-GraphQL parts)

The json payload here generated (uncompressed) was about 3.8MB

Although it would be quite involved, I would be able to re-run code on similar complex results and compare them, if we've something to test.

@Warxcell
Copy link
Contributor Author

@spawnia another thing that could improve performance is skipping

            $validationErrors = DocumentValidator::validate($schema, $documentNode, $validationRules);

            if ($validationErrors !== []) {
                return $promiseAdapter->createFulfilled(
                    new ExecutionResult(null, $validationErrors)
                );
            }

in case if document is cached. If document is cached - it was already validated somewhere in the past.

@Warxcell
Copy link
Contributor Author

Warxcell commented Dec 15, 2023

before
image

after:
image

basically 100-200ms improvement on one specific query on our side.

 $cacheItem = $this->queryCache->getItem(md5($op->query));

            if ($cacheItem->isHit()) {
                $documentNode = AST::fromArray($cacheItem->get());
            } else {
                $documentNode = Parser::parse($op->query);

                $queryComplexity = DocumentValidator::getRule(QueryComplexity::class);
                assert(
                    $queryComplexity instanceof QueryComplexity,
                    'should not register a different rule for QueryComplexity'
                );

                $queryComplexity->setRawVariableValues($op->variables);


                $validationErrors = DocumentValidator::validate($this->schema, $documentNode);

                if ($validationErrors !== []) {
                    return $promiseAdapter->createFulfilled(
                        new ExecutionResult(null, $validationErrors)
                    );
                } else {
                    $cacheItem->set(AST::toArray($documentNode));
                    $this->queryCache->save($cacheItem);
                }
            }

But i'm not sure how to incorporate this into here. I basically copied StandardServer, Helper and Grapql::promiseToExecute into my bundle in order to achieve that.

https://github.com/Warxcell/graphql-bundle/blob/main/src/Controller/GraphQL.php#L231-L257

Any advice how to proceed with this one?

@Warxcell
Copy link
Contributor Author

Warxcell commented Dec 15, 2023

I can't share the xdebug cachegrind output but I can leave a screenshot 😅 Hope it's useful: image

Of course Laravel / Eloquent also appears a LOT there.

But if you look at own time vs. amount of calls, I think even small optimizations can pay off (ofc. also in non-GraphQL parts)

The json payload here generated (uncompressed) was about 3.8MB

Although it would be quite involved, I would be able to re-run code on similar complex results and compare them, if we've something to test.

I've tested with blackfire, and from 300ms total response time, we have 80ms lost in is_iterable checks (173 times called).

image

@spawnia what about using assert(is_iterable()) (it won't be executed on production, hence the optimization) ?

@SebastienTainon
Copy link

SebastienTainon commented Mar 19, 2024

The performance of an API is a key point. Any small improvement in the response time would be very interesting! (and these benchmarks look promising)

@Warxcell
Copy link
Contributor Author

Warxcell commented Mar 19, 2024

The performance of an API is a key point. Any small improvement in the response time would be very interesting! (and these benchmarks look promising)

you can achieve same results by copy-paste-modify StandardServer.php

You can see what I've done here: https://github.com/Warxcell/graphql-bundle/blob/main/src/Controller/GraphQL.php ( think https://github.com/Warxcell/graphql-bundle/blob/main/src/Controller/GraphQL.php#L254-L280 is the most important part - it skips parsing && validation of query if its cached)

The question is - Can we incorporate that directly into this package?

@spawnia
Copy link
Collaborator

spawnia commented Mar 20, 2024

@spawnia what about using assert(is_iterable()) (it won't be executed on production, hence the optimization) ?

I don't want to change the default behaviour of this library in production settings. I very much appreciate how it prevents bad data from ever reaching clients, since we can be fearless and not use defensive programming there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants