Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document ways to configure trace sampling #14831

Closed
joaopgrassi opened this issue Apr 4, 2024 · 4 comments · Fixed by #14856
Closed

Document ways to configure trace sampling #14831

joaopgrassi opened this issue Apr 4, 2024 · 4 comments · Fixed by #14856

Comments

@joaopgrassi
Copy link
Member

Context

So far, there is 3 different ways to configure the Runtime random sampling in Istio:

In order of precedence:

All these options, configure a percentage based sampling mechanism. Although this is something, more robust sampling approaches exist. For example, the OpenTelemetry project defines several of them.

In a recent PR, I introduced a feature that allows users to configure such more "advanced" samplers to the OpenTelemetryTracingProvider. This now means we have yet another option to configure sampling.

One of the discussion in the PR was that we should clarify and demonstrate to users all this options and how one affect each other. This issue is to then track this documentation work.

How to improve it

Option 1:

We have this page Customizing Trace sampling which we show some of the options. We could simply expand that there and show all options and how one affect the other (which ones has precedence)

Pros:

  • No need for a new page

Cons:

  • Will talk about the OpenTelemetryTracingProvider although that's not related to that page. So it's a bit out of context

Option 2:

Introduce a new page under Distributed Tracing tasks specific for sampling. There we can document all the ways:

  1. New option for the OpenTelemetryTracingProvider with custom samplers: Takes precedence of all others and override Runtime random sampling to 100%. This is so all spans arrive at the sampler for its decision.
  2. defaultConfig.tracing.sampling default for everything
  3. telemetry.RandomSamplingPercentage can be changed per namespace
  4. PILOT_TRACE_SAMPLING is this still even worth documenting? The page here says it's discouraged https://istio.io/latest/docs/tasks/observability/distributed-tracing/mesh-and-proxy-config/#customizing-trace-sampling

I personally prefer option 2, as we can have examples and really drill down on the explanations, which I think will be very beneficial. When I started working with Istio I was very confused with all these different ways of setting this, and I didn't find any place that explained it.

CC @zirain @howardjohn

@howardjohn
Copy link
Member

Either one works for me. If we do 1 we need https://istio.io/latest/docs/tasks/observability/distributed-tracing/telemetry-api/ as well though.

Maybe we do 2, and make the "sampling" section in the current docs just link there

@zirain
Copy link
Member

zirain commented Apr 5, 2024

IMO, we should remove PILOT_TRACE_SAMPLING in the future, otherwise LGTM.

@joaopgrassi
Copy link
Member Author

Alright! I will try to put something together for a new page. Once I have a draft I will ping you both. Thanks!

@joaopgrassi
Copy link
Member Author

@howardjohn @zirain I created a draft PR #14856. Please let me know what you think. I'm open for ideas/suggestions. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants