Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NH-64718: otlp processor to support sampling and response time metrics #101

Merged
merged 31 commits into from
Jun 4, 2024

Conversation

xuan-cao-swi
Copy link
Contributor

@xuan-cao-swi xuan-cao-swi commented Jan 9, 2024

Description

Based on OTLP_METRICS to determine which processor is required for ruby agent.
otlp_processor take metrics_exporter to send metrics to otlp procotol collector

Test

With otel extension from swi: trace
Directly send to collector api: trace

@xuan-cao-swi xuan-cao-swi marked this pull request as ready for review March 27, 2024 16:11
@xuan-cao-swi xuan-cao-swi requested a review from a team as a code owner March 27, 2024 16:11
@xuan-cao-swi xuan-cao-swi changed the title NH-64718: (WIP) otlp processor NH-64718: otlp processor Mar 27, 2024
@cheempz cheempz changed the title NH-64718: otlp processor NH-40675 and NH-64718: otlp processor Apr 1, 2024
jerrytfleung
jerrytfleung previously approved these changes May 7, 2024
Copy link

@jerrytfleung jerrytfleung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

lib/solarwinds_apm.rb Outdated Show resolved Hide resolved
@cheempz cheempz changed the title NH-40675 and NH-64718: otlp processor NH-64718: otlp processor to support sampling and request metrics May 22, 2024
@cheempz cheempz changed the title NH-64718: otlp processor to support sampling and request metrics NH-64718: otlp processor to support sampling and response time metrics May 22, 2024
Copy link
Contributor

@cheempz cheempz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another batch of comments @xuan-cao-swi -- let's chat if you'd like to address some of these in follow-on PRs since this one is getting a bit long and afaict the existing solarwinds processor might need some fixing which is a bit out of scope.

ext/oboe_metal/extconf.rb Show resolved Hide resolved
lambda/layer/Gemfile Show resolved Hide resolved
lambda/layer/otel_wrapper.rb Show resolved Hide resolved
lib/solarwinds_apm.rb Outdated Show resolved Hide resolved
lib/solarwinds_apm/opentelemetry/otlp_processor.rb Outdated Show resolved Hide resolved
lib/solarwinds_apm/opentelemetry/otlp_processor.rb Outdated Show resolved Hide resolved
lib/solarwinds_apm/otel_lambda_config.rb Outdated Show resolved Hide resolved
lib/solarwinds_apm/otel_lambda_config.rb Outdated Show resolved Hide resolved
lib/solarwinds_apm/support/service_key_checker.rb Outdated Show resolved Hide resolved
@xuan-cao-swi xuan-cao-swi requested a review from cheempz May 28, 2024 18:01
Copy link
Contributor

@cheempz cheempz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking pretty good @xuan-cao-swi! One more batch of comments, this should be it once addressed.

lib/solarwinds_apm/api/transaction_name.rb Outdated Show resolved Hide resolved
SolarWindsAPM.logger.debug { "[#{self.class}/#{__method__}] processor on_finish span: #{span.to_span_data.inspect}" }

# return if span is non-entry span
return if span.parent_span_id != ::OpenTelemetry::Trace::INVALID_SPAN_ID
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorrect if it returns even for an entry span where the parent is remote. After some digging I think you're not using the non_entry_span method here because on_finish doesn't get passed the parent context, and unfortunately the OTel Ruby SDK for span only makes span.parent_span_id available and not the actual span parent, correct?

One solution is something like our JS custom distro's https://github.com/solarwinds/apm-js/blob/main/packages/sdk/src/parent-info-processor.ts. Another might be to have non_entry_span check for the presence of our proprietary service entry internal attrs attributes which the custom sampler sets... although it may be unsafe to assume our custom sampler will always be used, e.g. tail based sampling. So one other thought is an internal sw.is_entry_span: true attr could be set during on_start for similar purpose.

Whatever the approach, we do need this entry span logic to be correct because otherwise the metrics collected and shown to customers will be very wrong. Let's also chat about how to test for the correct trace metrics, it is a bit hard to automate but I'd like to at least have a session where we go over what the expectations are and how to verify the results in SWO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and unfortunately the OTel Ruby SDK for span only makes span.parent_span_id available and not the actual span parent, correct?

Yes, I think that's the reason I didn't use non_entry_span

So one other thought is an internal sw.is_entry_span: true attr could be set during on_start for similar purpose.

I think I will go with this solution. I will come up with some test case for this.

lib/solarwinds_apm/opentelemetry/otlp_processor.rb Outdated Show resolved Hide resolved
lib/solarwinds_apm/opentelemetry/otlp_processor.rb Outdated Show resolved Hide resolved
@exporter&.export([span.to_span_data]) if span.context.trace_flags.sampled?
return
end
return if span.parent_span_id != ::OpenTelemetry::Trace::INVALID_SPAN_ID
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as for otlp_processor

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

lib/solarwinds_apm/opentelemetry/solarwinds_processor.rb Outdated Show resolved Hide resolved
lib/solarwinds_apm/otel_lambda_config.rb Outdated Show resolved Hide resolved
xuan-cao-swi and others added 4 commits May 29, 2024 10:16
Co-authored-by: Lin Lin <lin.lin@solarwinds.com>
Co-authored-by: Lin Lin <lin.lin@solarwinds.com>
Co-authored-by: Lin Lin <lin.lin@solarwinds.com>
Copy link
Contributor

@cheempz cheempz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! You may want to hold off merging until we hear back from Grzegorz re: sw.is_entry_span

@xuan-cao-swi xuan-cao-swi merged commit 22c619e into main Jun 4, 2024
12 checks passed
@xuan-cao-swi xuan-cao-swi deleted the NH-64718-b branch June 4, 2024 15:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants