Sampling override with http.response.status_code doesn't work #3659

Closed
zwilling79 opened this issue Apr 22, 2024 · 6 comments · Fixed by #3708

@zwilling79

Expected behavior

If you add a sampling override that filters out all requests with a specific HTTP response status, those requests shouldn't be shown in Application Insights.

Actual behavior

HTTP requests with the specified status code are shown in Application Insights.

To Reproduce

  • Create a simple Spring Boot application with the health actuator endpoint enabled
  • Create an applicationinsights.json file that includes the sampling setting below:
  "sampling": {
    "percentage": 100,
    "overrides": [
      {
        "telemetryType": "request",
        "attributes": [
          {
            "key": "http.response.status_code",
            "value": 200,
            "matchType": "strict"
          }
        ],
        "percentage": 0
      }
    ]
  },

System information

Please provide the following information:

  • SDK Version 3.5.1 (Telemetry SDK Version: 1.35.0)
  • OS type and version: Windows 11
  • Application Server type and version (if applicable): Tomcat
  • Using spring-boot? Yes
  • Additional relevant libraries (with version, if applicable): n/a

Logs

2024-04-22 10:28:52.795+02:00 DEBUG c.m.a.a.i.exporter.AgentSpanExporter - exporting span: SpanData{spanContext=ImmutableSpanContext{traceId=0dad507a37d7da13c3a81e9139723846, spanId=4eadcbc1fe5c4e8b, traceFlags=01, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=true}, parentSpanContext=ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}, resource=Resource{schemaUrl=null, attributes={service.name="appinsights", telemetry.sdk.language="java", telemetry.sdk.name="opentelemetry", telemetry.sdk.version="1.35.0"}}, instrumentationScopeInfo=InstrumentationScopeInfo{name=io.opentelemetry.tomcat-10.0, version=2.1.0-alpha, schemaUrl=null, attributes={}}, name=GET /actuator/health, kind=SERVER, startEpochNanos=1713774532712869100, endEpochNanos=1713774532774249100, attributes=AttributesMap{data={thread.id=65, http.request.method=GET, http.route=/actuator/health, http.response.status_code=200, network.peer.address=127.0.0.1, server.address=localhost, client.address=127.0.0.1, url.path=/actuator/health, server.port=8080, network.protocol.version=1.1, user_agent.original=Apache-HttpClient/4.5.14 (Java/17.0.10), network.peer.port=60098, url.scheme=http, thread.name=http-nio-8080-exec-4, applicationinsights.internal.is_pre_aggregated=true}, capacity=128, totalAddedValues=15}, totalAttributeCount=15, events=[], totalRecordedEvents=0, links=[], totalRecordedLinks=0, status=ImmutableStatusData{statusCode=UNSET, description=}, hasEnded=true}
2024-04-22 10:28:57.251+02:00 DEBUG c.a.m.o.e.i.p.TelemetryItemExporter - sending telemetry to ingestion service:
{"ver":1,"name":"Metric","time":"2024-04-22T08:28:57.251Z","iKey":"ec7d4b96-3d1e-405a-8d5f-0d90258b5785","tags":{"ai.internal.sdkVersion":"java:3.5.1","ai.cloud.roleInstance":"...","ai.cloud.role":"appinsights"},"data":{"baseType":"MetricData","baseData":{"ver":2,"metrics":[{"name":"_OTELRESOURCE_","value":0.0}],"properties":{"telemetry.sdk.language":"java","service.name":"appinsights","service.instance.id":"...","telemetry.sdk.version":"1.35.0","telemetry.sdk.name":"opentelemetry"}}}}
{"ver":1,"name":"Request","time":"2024-04-22T08:28:52.712Z","iKey":"ec7d4b96-3d1e-405a-8d5f-0d90258b5785","tags":{"ai.internal.sdkVersion":"java:3.5.1","ai.operation.id":"0dad507a37d7da13c3a81e9139723846","ai.cloud.roleInstance":"...","ai.operation.name":"GET /actuator/health","ai.location.ip":"127.0.0.1","ai.cloud.role":"appinsights","ai.user.userAgent":"Apache-HttpClient/4.5.14 (Java/17.0.10)"},"data":{"baseType":"RequestData","baseData":{"ver":2,"id":"4eadcbc1fe5c4e8b","name":"GET /actuator/health","duration":"00:00:00.061380","success":true,"responseCode":"200","url":"http://localhost:8080/actuator/health","properties":{"_MS.ProcessedByMetricExtractors":"True"}}}}
@zwilling79 changed the title from "Sampling override with https.respone.status_code doesn't work" to "Sampling override with http.response.status_code doesn't work" on Apr 22, 2024
@heyams self-assigned this on Apr 22, 2024
@heyams
Contributor

heyams commented Apr 22, 2024

@zwilling79 you can use an OpenTelemetry extension to filter telemetry based on http.response.status_code.
Here is an example of how to filter out telemetry based on duration. You can do something similar.
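
As a rough illustration of the idea behind this suggestion, the sketch below models filtering at export time, when the span has ended and its final attributes, including the status code, are available. It is a language-neutral model in Python, not an OpenTelemetry extension: a real extension for the Java agent would be written in Java, and `FinishedSpan`, `drop_by_status`, and the `GET /orders` span are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FinishedSpan:
    """Hypothetical stand-in for a span that has already ended."""
    name: str
    attributes: dict = field(default_factory=dict)


def drop_by_status(spans, status_code):
    """Return the spans to export, omitting those with the given status code."""
    return [
        span for span in spans
        if span.attributes.get("http.response.status_code") != status_code
    ]


# Two finished spans; the second one is a made-up example endpoint.
finished = [
    FinishedSpan("GET /actuator/health", {"http.response.status_code": 200}),
    FinishedSpan("GET /orders", {"http.response.status_code": 500}),
]

# Only the span that did not return 200 would be passed on to the exporter.
for span in drop_by_status(finished, 200):
    print(span.name)  # GET /orders
```

Because the decision is made after the span has ended, this approach can see the status code, at the cost of shipping the rule as code rather than configuration.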

@zwilling79
Author

Hm, this may work. Nonetheless, I would prefer to keep this in the configuration file so that it can be adjusted easily, especially if it is specific to certain environments. For instance, today I just want to filter out the health check and Prometheus endpoint requests that have a response code of 200. Tomorrow I may want to filter out some additional business application endpoints that have a response code of 200. Compiling, packaging, and distributing the OTel extension JAR for such changes seems like overkill. Furthermore, if you want to use different configurations for different environments, you have to maintain different OTel extension JARs or add more complexity to read and evaluate further configuration files.

I think the problem in the code is that the values of the sampling override attributes are always treated as strings, but the actual attribute is of type integer. So it is perhaps similar to #3378.

@jeanbisutti
Member

Only attributes set at the start of the span are available for sampling, so attributes such as http.response.status_code or request duration won't work for sampling.
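
The restriction can be illustrated with a simplified model of a sampling decision made at span start. The sketch below is a language-neutral model in Python, not the agent's code: `Override` and `sample_at_start` are hypothetical, and which attributes exist at span start (here `http.request.method` and `url.path`) is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Override:
    """Hypothetical stand-in for one strict-match sampling override."""
    key: str
    value: object
    percentage: float


def sample_at_start(start_attributes: dict, overrides: list, default_percentage: float) -> float:
    """Return the sampling percentage chosen when the span starts.

    Only attributes already present at span start can match an override.
    """
    for override in overrides:
        if start_attributes.get(override.key) == override.value:
            return override.percentage
    return default_percentage


# Attributes assumed to be known when the server span starts.
start_attributes = {
    "http.request.method": "GET",
    "url.path": "/actuator/health",
}

# The override from the report: drop requests whose status code is 200.
status_override = Override("http.response.status_code", 200, 0)

# An override on an attribute that is assumed to exist at span start.
path_override = Override("url.path", "/actuator/health", 0)

# The status code does not exist yet, so the override cannot match
# and the default percentage applies.
print(sample_at_start(start_attributes, [status_override], 100))  # 100

# The path is already known, so this override matches.
print(sample_at_start(start_attributes, [path_override], 100))  # 0

# The status code is only added once the response has been produced,
# after the sampling decision has already been made.
end_attributes = {**start_attributes, "http.response.status_code": 200}
```

In this model the status code override is never wrong, only too early: by the time the attribute exists, the decision has already been taken.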

Alternatively, you could try a data collection rule (DCR) transformation. A tutorial: https://learn.microsoft.com/en-us/azure/azure-monitor/logs/tutorial-workspace-transformations-portal

@potzkovge

potzkovge commented Apr 26, 2024

> Only attributes set at the start of the span are available for sampling, so attributes such as http.response.status_code or request duration won't work for sampling.

It is very confusing which attributes are available for sampling since 3.5.0. The docs point you to the "exporting span" line, but that line is of little use: it includes http.status_code and does not print, for example, url.full, which I am able to use even though it is not included in the "exporting span" line. While the next line warns you that only attributes set at the start of the span are available for sampling, it would be great to see which attributes are actually available when I set my log level to debug.

@zwilling79
Author

I did exactly the same: I enabled debug logging to see which fields I could filter on. Because I saw http.response.status_code=200 in the attributes list, I thought I could filter on it.

@trask
Member

trask commented May 13, 2024

We are thinking of adding a warning during startup if a sampling override uses attributes that are known not to be available at span start, such as http.response.status_code.
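
A startup check of this kind could look roughly like the sketch below. It is a language-neutral model in Python, not the agent's implementation: `LATE_ATTRIBUTES` and `find_late_attribute_warnings` are hypothetical, the warning text is invented, and the set contains only the one attribute named in this thread.

```python
# Attributes assumed to be set only after span start. Only
# http.response.status_code is named in this thread; a real list would be longer.
LATE_ATTRIBUTES = {"http.response.status_code"}


def find_late_attribute_warnings(sampling_config: dict) -> list:
    """Return one warning per override attribute that cannot match at span start."""
    warnings = []
    for index, override in enumerate(sampling_config.get("overrides", [])):
        for attribute in override.get("attributes", []):
            key = attribute.get("key")
            if key in LATE_ATTRIBUTES:
                warnings.append(
                    f"sampling override {index}: attribute '{key}' is not "
                    "available at span start and will never match"
                )
    return warnings


# The sampling section from the original report.
sampling = {
    "percentage": 100,
    "overrides": [
        {
            "telemetryType": "request",
            "attributes": [
                {"key": "http.response.status_code", "value": 200, "matchType": "strict"}
            ],
            "percentage": 0,
        }
    ],
}

for warning in find_late_attribute_warnings(sampling):
    print(warning)
```

Checking the configuration once at startup would surface the problem immediately instead of leaving users to infer it from the debug log.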
