Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add request size metric #1886

Closed
wants to merge 4 commits into from
Closed

Conversation

kailyak
Copy link
Contributor

@kailyak kailyak commented Apr 18, 2024

Pull request checklist

  • Please read our contributor guide
  • Consider creating a discussion on the discussion forum
    first
  • Make sure the PR doesn't introduce backward compatibility issues
  • Make sure to have sufficient test cases

Pull Request type

  • Bugfix
  • Feature
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • Other (please describe):

Changes in this PR

Added micrometer DistributionSummmary metric for graphQL request size.

Metric Name: gql.request.size
Tags: [
	- tag(gql.operation={operationTypeValue})
	- tag(gql.operation.name={operationNameValue})
	- tag(gql.query.complexity={qComplexityValue})
	- tag(gql.query.sig.hash={sigHashValue})
	]

This metric is useful in measuring the request size for a graphql operation.
The request size measurement is the size in bytes of the request body, obtained via the Content-Length header in the graphql request. This avoids needing to de/serialize data. A graphql request body contains: query, variables, and operationname.
Via this approach, the Content-Type header for the response is not available at this point in the request processing. This change is scoped to the request size metric, as it is a metric that is useful in correlating graphql request sizes to graphql operations.

Tested via smoke tests and manual testing on ExampleApp.

Side note: An alternative approach considered was adding a web filter which would allow for more efficient measurement of both the request and response sizes but it is not standardized for graphql clients to send operation-name as a header. Presently getting the operation-name in the filter would require deserializing the http request body to get the "operationName" field.

internal object MeasurementUtils {
inline fun <reified T> serializeAndMeasureSize(obj: T): Int {
return ByteArrayOutputStream().use { outputStream ->
mapper.writeValue(outputStream, obj)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This serializes the whole thing just for the metric. Since JSON serialization is quite expensive, I'm expecting this to impact performance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps capturing the size of the request and response body in a ServletFilter might save on the overhead, since it's serialized already?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a ServletFilter would be a more appropriate place. There's also some complexity because you can only read the body once in a filter. I'm not completely sure what the best way to get the size there is.

Either way, something like that needs to be done so that body is serialized/deserialized more than once. Extra serialization/deserialization won't be acceptable from a CPU perspective.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the feedback, I will investigate the ServletFilter approach

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After digging for a little bit, Spring has a ContentCachingRequestWrapper and ContentCachingResponseWrapper that takes care of the "can only read/write once" issue.

A filter like this seems to work:


@Component
public class SizeFilter implements Filter {
    @Override
    public void init(FilterConfig filterConfig) throws ServletException {
        Filter.super.init(filterConfig);
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {

        if (request instanceof HttpServletRequest servletRequest) {
            var wrappedRequest = new ContentCachingRequestWrapper(servletRequest);
            var wrappedResponse = new ContentCachingResponseWrapper((HttpServletResponse) response);
            System.out.println("Request content length: " + wrappedRequest.getContentLength());
            chain.doFilter(wrappedRequest, wrappedResponse);

            System.out.println("Response content length: " + wrappedResponse.getContentSize());
            wrappedResponse.copyBodyToResponse();

        } else {
            chain.doFilter(request, response);
        }
    }

    @Override
    public void destroy() {
        Filter.super.destroy();
    }
}

There will still be some overhead because it needs to copy bytes around, but it will be way less expensive than JSON (de)serialization.

This sits at the request level, so it's not GraphQL specific at all. It probably doesn't need to be?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, if the query is available as a string, then that's all we need! assuming variables are also taken care of.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah thinking about it more, I don't think it's needed to serialize it, I thought serializing it would more accurately reflect the size but it might not be worth the tradeoff plus yeah, the metric is likely needed more to get an idea of the size, not the exact size and I'll double check with the user who requested the metric

For the variables, in the approach where we measure the size in the instrumentation code, the type of that and other components types:

  • variables (Map<String, Object>)
  • result data (Map<String, *>)
  • extensions (Map<Object, Object>)
  • errors (List<GraphQLError>)

So they are not available as strings like the input query is, I'll look into the object types as that might be okay, if not then I don't think this has to be graphQL specific necessarily and the approach with measuring at the request/response level would work better

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a request body, one can just check the Content-Length header. For the response, I think it should be opt-in regardless due to the overhead of performing extra serialization.

See the related micrometer issue: micrometer-metrics/micrometer#880

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we just add the metric for the request using the content-length as part of instrumentation for now. The response body metric is proving to be costly so I doubt it's a useful metric to have anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kilink , yes, to get operationName for the metric tag, it seems like the request body would need to be deserialized in the filter. It would be cool if operationName was a expected header for graphql clients to provide, then, with the filter approach, getting the response size would be more efficient. Thanks for the suggestion of content length!
@srinivasankavitha yes, for now I will add the request size as that is a valuable metric to have and with content-length we can get the size efficiently in the instrumentation code, thanks for the feedback!

@kailyak kailyak force-pushed the add-request-response-size-metrics branch from 2c7561c to ce750f0 Compare April 23, 2024 18:32
@kailyak kailyak changed the title Add request response size metrics Add request size metric Apr 23, 2024
.containsAll(
Tags.of("gql.operation", "QUERY")
.and("gql.operation.name", "my_op_1")
.and("gql.query.complexity", "5")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need tags like query.complexity on this metric?

state: MetricsInstrumentationState
) {
// Capture size using Content-Length header.
val requestSizeBytes = DgsContext.from(executionContext.graphQLContext).requestData?.headers?.contentLength
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks much better from a performance perspective :)

@kailyak kailyak closed this May 6, 2024
@kailyak kailyak deleted the add-request-response-size-metrics branch May 6, 2024 22:21
@kailyak
Copy link
Contributor Author

kailyak commented May 6, 2024

Thank you for the feedback, an internal implementation of these metrics is being explored first, closed out this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants