New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add request size metric #1886
Add request size metric #1886
Conversation
internal object MeasurementUtils { | ||
inline fun <reified T> serializeAndMeasureSize(obj: T): Int { | ||
return ByteArrayOutputStream().use { outputStream -> | ||
mapper.writeValue(outputStream, obj) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This serializes the whole thing just for the metric. Since JSON serialization is quite expensive, I'm expecting this to impact performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps capturing the size of the request and response body in a ServletFilter might save on the overhead, since it's serialized already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a ServletFilter would be a more appropriate place. There's also some complexity because you can only read the body once in a filter. I'm not completely sure what the best way to get the size there is.
Either way, something like that needs to be done so that body is serialized/deserialized more than once. Extra serialization/deserialization won't be acceptable from a CPU perspective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the feedback, I will investigate the ServletFilter approach
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After digging for a little bit, Spring has a ContentCachingRequestWrapper
and ContentCachingResponseWrapper
that takes care of the "can only read/write once" issue.
A filter like this seems to work:
@Component
public class SizeFilter implements Filter {
@Override
public void init(FilterConfig filterConfig) throws ServletException {
Filter.super.init(filterConfig);
}
@Override
public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
if (request instanceof HttpServletRequest servletRequest) {
var wrappedRequest = new ContentCachingRequestWrapper(servletRequest);
var wrappedResponse = new ContentCachingResponseWrapper((HttpServletResponse) response);
System.out.println("Request content length: " + wrappedRequest.getContentLength());
chain.doFilter(wrappedRequest, wrappedResponse);
System.out.println("Response content length: " + wrappedResponse.getContentSize());
wrappedResponse.copyBodyToResponse();
} else {
chain.doFilter(request, response);
}
}
@Override
public void destroy() {
Filter.super.destroy();
}
}
There will still be some overhead because it needs to copy bytes around, but it will be way less expensive than JSON (de)serialization.
This sits at the request level, so it's not GraphQL specific at all. It probably doesn't need to be?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, if the query is available as a string, then that's all we need! assuming variables are also taken care of.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah thinking about it more, I don't think it's needed to serialize it, I thought serializing it would more accurately reflect the size but it might not be worth the tradeoff plus yeah, the metric is likely needed more to get an idea of the size, not the exact size and I'll double check with the user who requested the metric
For the variables, in the approach where we measure the size in the instrumentation code, the type of that and other components types:
- variables (
Map<String, Object>
) - result data (
Map<String, *>
) - extensions (
Map<Object, Object>
) - errors (
List<GraphQLError>
)
So they are not available as strings like the input query is, I'll look into the object types as that might be okay, if not then I don't think this has to be graphQL specific necessarily and the approach with measuring at the request/response level would work better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a request body, one can just check the Content-Length
header. For the response, I think it should be opt-in regardless due to the overhead of performing extra serialization.
See the related micrometer issue: micrometer-metrics/micrometer#880
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we just add the metric for the request using the content-length as part of instrumentation for now. The response body metric is proving to be costly so I doubt it's a useful metric to have anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kilink , yes, to get operationName for the metric tag, it seems like the request body would need to be deserialized in the filter. It would be cool if operationName was a expected header for graphql clients to provide, then, with the filter approach, getting the response size would be more efficient. Thanks for the suggestion of content length!
@srinivasankavitha yes, for now I will add the request size as that is a valuable metric to have and with content-length we can get the size efficiently in the instrumentation code, thanks for the feedback!
2c7561c
to
ce750f0
Compare
.containsAll( | ||
Tags.of("gql.operation", "QUERY") | ||
.and("gql.operation.name", "my_op_1") | ||
.and("gql.query.complexity", "5") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need tags like query.complexity on this metric?
state: MetricsInstrumentationState | ||
) { | ||
// Capture size using Content-Length header. | ||
val requestSizeBytes = DgsContext.from(executionContext.graphQLContext).requestData?.headers?.contentLength |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks much better from a performance perspective :)
Thank you for the feedback, an internal implementation of these metrics is being explored first, closed out this PR. |
Pull request checklist
first
Pull Request type
Changes in this PR
Added micrometer DistributionSummmary metric for graphQL request size.
This metric is useful in measuring the request size for a graphql operation.
The request size measurement is the size in bytes of the request body, obtained via the Content-Length header in the graphql request. This avoids needing to de/serialize data. A graphql request body contains: query, variables, and operationname.
Via this approach, the Content-Type header for the response is not available at this point in the request processing. This change is scoped to the request size metric, as it is a metric that is useful in correlating graphql request sizes to graphql operations.
Tested via smoke tests and manual testing on ExampleApp.
Side note: An alternative approach considered was adding a web filter which would allow for more efficient measurement of both the request and response sizes but it is not standardized for graphql clients to send operation-name as a header. Presently getting the operation-name in the filter would require deserializing the http request body to get the "operationName" field.