Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracing: Spans are not in the right order and do not respect the naming convention #43645

Open
joegoldman2 opened this issue Apr 25, 2024 · 3 comments
Labels
Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. needs-team-attention This issue needs attention from Azure service team or SDK team OpenTelemetry OpenTelemetry instrumentation (not Monitor-specific) question The issue doesn't require a change to the product in order to be resolved. Most issues start as that Service Bus

Comments

@joegoldman2
Copy link

Library name

Azure.Messaging.ServiceBus

Please describe the feature.

I have an application that uses the Azure Service Bus SDK through MassTransit.
Today, I'm sending the telemetry to Application Insights using the OpenTelemetry collector and the Azure Monitor exporter. I find that the result is cleaner when I subscribe to MT activity source rather than the Azure one:

With TracerProviderBuilder.AddSource("MassTransit"):

MT

With TracerProviderBuilder.AddSource("Azure.*"):

Azure

Two issues with Azure activity source:

  • The send and process activities are at the same level (same parent id). I expect the process activity to be under the send one. Probably because the Azure SDK is setting the traceparent in the message's headers before starting the send activity (so the traceparent corresponds to the request POST /orders).
  • The name of the spans created by MT are following the spec: <destination name> <operation name> and are easier to understand.

cc @lmolkova

@github-actions github-actions bot added Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. Monitor - Exporter Monitor OpenTelemetry Exporter needs-team-attention This issue needs attention from Azure service team or SDK team question The issue doesn't require a change to the product in order to be resolved. Most issues start as that Service Attention This issue is responsible by Azure service team. labels Apr 25, 2024
Copy link

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @cijothomas @rajkumar-rangaraj @reyang @TimothyMothra @vishweshbankwar.

@cijothomas cijothomas removed the Monitor - Exporter Monitor OpenTelemetry Exporter label Apr 25, 2024
@lmolkova
Copy link
Member

lmolkova commented Apr 25, 2024

Thanks for reporting this @joegoldman2

The send and process activities are at the same level (same parent id). I expect the process activity to be under the send one. Probably because the Azure SDK is setting the traceparent in the message's headers before starting the send activity (so the traceparent corresponds to the request POST /orders).

ServiceBus SDK creates the following spans

  • Message span (create operation in OTel spec). Its context goes on the wire
    • Process span is linked to it and also becomes a child of it.
  • Send span is linked to a message, but it's usually a message sibling and an uncle of process.

this behavior aligns with OTel spec, and we will probably polish it further (#43355)

Could you please check the data (parent ids) and check if it's the case. If you see that based on parent-id "send and process activities are at the same level (same parent id)", it's definitely an issue.

The name of the spans created by MT are following the spec: and are easier to understand.

You're right. We'll fix it.

@lmolkova lmolkova added Service Bus OpenTelemetry OpenTelemetry instrumentation (not Monitor-specific) and removed Service Attention This issue is responsible by Azure service team. labels Apr 25, 2024
@github-actions github-actions bot added the needs-team-triage This issue needs the team to triage. label Apr 25, 2024
@jsquire jsquire removed the needs-team-triage This issue needs the team to triage. label Apr 25, 2024
@joegoldman2
Copy link
Author

Could you please check the data (parent ids) and check if it's the case. If you see that based on parent-id "send and process activities are at the same level (same parent id)", it's definitely an issue.

It seems that send and process activities have the correct ids. I think what makes the result strange compared to MassTransit is the additional activity called Message, which has the same id as the send activity.

I exported the data using the debug exporter of the OTel collector:

otel-collector  | ScopeSpans #0
otel-collector  | ScopeSpans SchemaURL:
otel-collector  | InstrumentationScope Azure.Messaging.ServiceBus.Message
otel-collector  | Span #0
otel-collector  |     Trace ID       : bdac052a452b23d96c925dbb6cf43693
otel-collector  |     Parent ID      : a1b3a313243c9116
otel-collector  |     ID             : c6c3f02898627a2f
otel-collector  |     Name           : Message
otel-collector  |     Kind           : Producer
otel-collector  |     Start time     : 2024-05-01 14:38:35.0075534 +0000 UTC
otel-collector  |     End time       : 2024-05-01 14:38:35.0076055 +0000 UTC
otel-collector  |     Status code    : Unset
otel-collector  |     Status message :
otel-collector  | Attributes:
otel-collector  |      -> az.namespace: Str(Microsoft.ServiceBus)
otel-collector  |      -> messaging.system: Str(servicebus)
otel-collector  |      -> server.address: Str(xxxx.servicebus.windows.net)
otel-collector  |      -> messaging.destination.name: Str(orders)
otel-collector  |      -> az.schema_url: Str(https://opentelemetry.io/schemas/1.23.0)

// Send span
otel-collector  | ScopeSpans #1
otel-collector  | ScopeSpans SchemaURL:
otel-collector  | InstrumentationScope Azure.Messaging.ServiceBus.ServiceBusSender
otel-collector  | Span #0
otel-collector  |     Trace ID       : bdac052a452b23d96c925dbb6cf43693
otel-collector  |     Parent ID      : a1b3a313243c9116
otel-collector  |     ID             : 4d362ae1fee9c224
otel-collector  |     Name           : ServiceBusSender.Send
otel-collector  |     Kind           : Client
otel-collector  |     Start time     : 2024-05-01 14:38:35.0078473 +0000 UTC
otel-collector  |     End time       : 2024-05-01 14:38:35.0703801 +0000 UTC
otel-collector  |     Status code    : Unset
otel-collector  |     Status message :
otel-collector  | Attributes:
otel-collector  |      -> az.namespace: Str(Microsoft.ServiceBus)
otel-collector  |      -> messaging.system: Str(servicebus)
otel-collector  |      -> messaging.operation: Str(publish)
otel-collector  |      -> server.address: Str(xxxx.servicebus.windows.net)
otel-collector  |      -> messaging.destination.name: Str(orders)
otel-collector  |      -> az.schema_url: Str(https://opentelemetry.io/schemas/1.23.0)
otel-collector  | Links:
otel-collector  | SpanLink #0
otel-collector  |      -> Trace ID: bdac052a452b23d96c925dbb6cf43693
otel-collector  |      -> ID: c6c3f02898627a2f
otel-collector  |      -> TraceState:
otel-collector  |      -> DroppedAttributesCount: 0

// Receive + Complete part
otel-collector  | ScopeSpans #0
otel-collector  | ScopeSpans SchemaURL:
otel-collector  | InstrumentationScope Azure.Messaging.ServiceBus.ServiceBusReceiver
otel-collector  | Span #0
otel-collector  |     Trace ID       : a027daa1769107022fe96f01b380f5ee
otel-collector  |     Parent ID      :
otel-collector  |     ID             : db59a7456c2fa8c5
otel-collector  |     Name           : ServiceBusReceiver.Receive
otel-collector  |     Kind           : Client
otel-collector  |     Start time     : 2024-05-01 16:37:57.1256906 +0000 UTC
otel-collector  |     End time       : 2024-05-01 14:38:35.0709157 +0000 UTC
otel-collector  |     Status code    : Unset
otel-collector  |     Status message :
otel-collector  | Attributes:
otel-collector  |      -> az.namespace: Str(Microsoft.ServiceBus)
otel-collector  |      -> messaging.system: Str(servicebus)
otel-collector  |      -> messaging.operation: Str(receive)
otel-collector  |      -> server.address: Str(xxxx.servicebus.windows.net)
otel-collector  |      -> messaging.destination.name: Str(orders)
otel-collector  |      -> az.schema_url: Str(https://opentelemetry.io/schemas/1.23.0)
otel-collector  | Links:
otel-collector  | SpanLink #0
otel-collector  |      -> Trace ID: bdac052a452b23d96c925dbb6cf43693
otel-collector  |      -> ID: c6c3f02898627a2f
otel-collector  |      -> TraceState:
otel-collector  |      -> DroppedAttributesCount: 0
otel-collector  | Span #1
otel-collector  |     Trace ID       : bdac052a452b23d96c925dbb6cf43693
otel-collector  |     Parent ID      : 37e763ceecb05303
otel-collector  |     ID             : 79b56d97672881e1
otel-collector  |     Name           : ServiceBusReceiver.Complete
otel-collector  |     Kind           : Client
otel-collector  |     Start time     : 2024-05-01 14:38:35.0748708 +0000 UTC
otel-collector  |     End time       : 2024-05-01 14:38:35.1695553 +0000 UTC
otel-collector  |     Status code    : Unset
otel-collector  |     Status message :
otel-collector  | Attributes:
otel-collector  |      -> az.namespace: Str(Microsoft.ServiceBus)
otel-collector  |      -> messaging.system: Str(servicebus)
otel-collector  |      -> messaging.operation: Str(settle)
otel-collector  |      -> server.address: Str(xxxx.servicebus.windows.net)
otel-collector  |      -> messaging.destination.name: Str(orders)
otel-collector  |      -> az.schema_url: Str(https://opentelemetry.io/schemas/1.23.0)
otel-collector  | Links:
otel-collector  | SpanLink #0
otel-collector  |      -> Trace ID: bdac052a452b23d96c925dbb6cf43693
otel-collector  |      -> ID: c6c3f02898627a2f
otel-collector  |      -> TraceState:
otel-collector  |      -> DroppedAttributesCount: 0
otel-collector  | ScopeSpans #1
otel-collector  | ScopeSpans SchemaURL:
otel-collector  | InstrumentationScope Azure.Messaging.ServiceBus.ServiceBusProcessor
otel-collector  | Span #0
otel-collector  |     Trace ID       : bdac052a452b23d96c925dbb6cf43693
otel-collector  |     Parent ID      : c6c3f02898627a2f
otel-collector  |     ID             : 37e763ceecb05303
otel-collector  |     Name           : ServiceBusProcessor.ProcessMessage
otel-collector  |     Kind           : Consumer
otel-collector  |     Start time     : 2024-05-01 14:38:35.0709788 +0000 UTC
otel-collector  |     End time       : 2024-05-01 14:38:35.1738795 +0000 UTC
otel-collector  |     Status code    : Unset
otel-collector  |     Status message :
otel-collector  | Attributes:
otel-collector  |      -> az.namespace: Str(Microsoft.ServiceBus)
otel-collector  |      -> messaging.system: Str(servicebus)
otel-collector  |      -> messaging.operation: Str(process)
otel-collector  |      -> server.address: Str(xxxx.servicebus.windows.net)
otel-collector  |      -> messaging.destination.name: Str(orders)
otel-collector  |      -> az.schema_url: Str(https://opentelemetry.io/schemas/1.23.0)
otel-collector  |      -> messaging.masstransit.message_id: Str(dc450000-3349-a4f9-ebf6-08dc69fd251c)
otel-collector  |      -> messaging.message.conversation_id: Str(dc450000-3349-a4f9-fc38-08dc69fd251c)
otel-collector  |      -> messaging.masstransit.destination_address: Str(sb://xxxx.servicebus.windows.net/orders)

You can see the first two spans (Message and ServiceBusSender.Send) have the same parent id (and it's probably ok). But the activity Message started before the activity ServiceBusSender.Send, so it appears before and gives the impression that the process part (receive + process) didn't happen after the send activity:

Screenshot 2024-05-01

Maybe a solution to make it clearer is not to subscribe to all Azure.* sources, but only to Azure.Messaging.ServiceBus., Azure.Messaging.ServiceBus.ServiceBusReceiver and Azure.Messaging.ServiceBus.ServiceBusProcessor (in order to skip Azure.Messaging.ServiceBus.Message). But it makes things more complex because today the OTel SDK doesn't (easily) support source exclusion (to subscribe to all Azure.* and just exclude the one you're not interested in for example).

Also I wonder whether the activity ServiceBusProcessor.ProcessMessage should not have the ServiceBusSender.Send activity as its parent rather than the Message activity , as is the case today.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. needs-team-attention This issue needs attention from Azure service team or SDK team OpenTelemetry OpenTelemetry instrumentation (not Monitor-specific) question The issue doesn't require a change to the product in order to be resolved. Most issues start as that Service Bus
Projects
None yet
Development

No branches or pull requests

4 participants