[release/8.0] Fix duplicate error.type tags in metrics #55301
Merged
Fix duplicate error.type tags in metrics
Description
ASP.NET Core has a prominent metric, http.server.request.duration, that reports information about HTTP request duration along with important metadata. One piece of metadata is whether an error occurred during the request. This is the error.type tag.
The error.type tag can be reported from multiple places, and it's possible to get duplicate tag values in rare situations. Duplicate metrics tags can break some telemetry tooling.
error.type can be added by exception middleware and it can be added in hosting if there is an unhandled exception. That means there can be duplicates if:
Fix by conditionally adding error.type to the tag collection only if it isn't already present, aka first value wins.
Fixes #55159
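
The "first value wins" rule described above can be sketched roughly as follows. This is a minimal illustration and not necessarily the code in this PR: the class name TagListExtensions and the method name TryAddTag are assumptions made for the example. System.Diagnostics.TagList and its Count, indexer, and Add members are the real API.

```csharp
using System.Diagnostics;

// Hypothetical helper for illustration; names are assumptions, not the PR's code.
public static class TagListExtensions
{
    // Adds the tag only if no tag with the same key is already present.
    // Returns true if the tag was added, false if an existing value was kept.
    public static bool TryAddTag(this ref TagList tags, string name, object? value)
    {
        // TagList is a small struct-backed list, so a linear scan is cheap.
        for (var i = 0; i < tags.Count; i++)
        {
            if (tags[i].Key == name)
            {
                return false; // first value wins
            }
        }

        tags.Add(name, value);
        return true;
    }
}
```

With a helper like this, a call site that reports an error would use tags.TryAddTag("error.type", value) instead of tags.Add(...), so a second reporter for the same request leaves the tag list unchanged.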
Customer Impact
Reported by customer at #55159. Prometheus (popular OSS telemetry database) can't handle duplicates and fails to load new metrics.
There is also open-telemetry/opentelemetry-dotnet#5199 which appears to be from the same issue and has 6 +1 votes.
A workaround is to not use the ASP.NET Core error middleware, but it is a very popular feature, and it's not obvious that removing it resolves the problem.
Regression?
Risk
The fix is simple: don't add the error.type tag to the metrics collection if it is already present.
Verification
Before:
After:
Packaging changes reviewed?