Backend response duration is too high #3419

Closed
tcaty opened this issue Feb 22, 2024 · 4 comments
Labels
stale Used for stale issues / PRs

Comments

@tcaty

tcaty commented Feb 22, 2024

Describe the bug
Hi! Our team uses Grafana Tempo. We recently ran into an issue where a simple query takes a very long time. How can we tune Tempo's performance so that traces show up immediately?

Query:

{resource.service.name="${service_name}" && resource.env="${env}" && 400 <= .http.status_code && .http.status_code != 404 && .http.status_code < 600}

Response time:

(screenshot of the query response time, not included)

To Reproduce
Steps to reproduce the behavior:

  1. Start chart tempo-operational v1.7.1
  2. Perform Operations (Read).
  3. Wait too long

Expected behavior
See traces immediately

Environment:

Additional Context
We use the tempo-operational dashboard to monitor our Tempo instance. The screenshots below show what we see:

(dashboard screenshots, not included)
We think the problem is in the querier component itself, so we gave it a lot of resources, but it is still slow.

querier:
  replicas: 2
  resources:
    requests:
      cpu: 2
      memory: 2Gi
    limits:
      cpu: 8
      memory: 10Gi

How can we boost performance?

@joe-elliott
Member

There are lots of ways to improve the perf of TraceQL! Listed in the order I think you should consider them:

  1. An instant big win would be to add scopes to all of your attributes:
{resource.service.name="${service_name}" && resource.env="${env}" && 400 <= span.http.status_code && span.http.status_code != 404 && span.http.status_code < 600}
  1. Set up GRPC Streaming
    https://grafana.com/docs/tempo/latest/api_docs/#tempo-grpc-api
    This also (currently) requires setting a Grafana feature flag

  2. Configure dedicated columns:
    docs
    blogpost

  3. Use multiple caching layers which are added in 2.4:
    https://grafana.com/docs/tempo/next/configuration/#cache

  4. Search perf configurables
    This advice is a bit out of date, and only applies once you start scaling Tempo quite large. I would ignore the serverless parts (we have had issues getting good perf), but the major tunables discussion is still correct. If you are running 50+ queriers I would start to care about this.
    https://grafana.com/docs/tempo/latest/operations/backend_search/
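
For illustration, here is a rough sketch of how the streaming, dedicated-column, and caching suggestions above could look in a Tempo configuration file. It is not taken from this thread: picking `env` as a dedicated column is an assumption based on the query in the issue, the hosts `memcached-instance` and `redis-instance` are placeholders, and every key should be checked against the documentation for the Tempo version in use. With a Helm chart, these settings would go wherever that chart exposes the Tempo config.

```yaml
# Sketch only: verify every key against the docs for your Tempo version.

# gRPC streaming: allow streaming search results over the HTTP port.
stream_over_http_enabled: true

storage:
  trace:
    block:
      # Dedicated columns need the vParquet3 block format.
      version: vParquet3
      parquet_dedicated_columns:
        # Assumption: resource.env is filtered on in most queries,
        # so it gets its own column instead of the generic attribute map.
        - name: env
          type: string
          scope: resource

# Caching layers (available from Tempo 2.4): one cache per role.
cache:
  caches:
    - roles:
        - parquet-footer
      memcached:
        host: memcached-instance   # placeholder host
    - roles:
        - bloom
      redis:
        endpoint: redis-instance   # placeholder endpoint
```

Dedicated columns only apply to blocks written after the change, so any speedup would show up gradually as new blocks replace old ones. On the Grafana side, the streaming feature flag mentioned above was, as far as I know, the `traceQLStreaming` feature toggle at the time of this thread; treat that name as something to confirm in the Grafana docs.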

@tcaty
Author

tcaty commented Feb 24, 2024

@joe-elliott thank you for your reply! I'll try this next week and report back on what actually helped us!

Contributor

This issue has been automatically marked as stale because it has not had any activity in the past 60 days.
The next time this stale check runs, the stale label will be removed if there is new activity. The issue will be closed after 15 days if there is no new activity.
Please apply keepalive label to exempt this Issue.

@github-actions github-actions bot added the stale Used for stale issues / PRs label Apr 25, 2024
@tcaty
Author

tcaty commented May 2, 2024

I apologize for the long delay; here is our feedback. We applied suggestions 1, 3, and 4, and I can say that their order by performance impact is exactly right for us. We noticed improvements immediately after optimizing our TraceQL queries. The second one helped a lot as well. We drop the heaviest and least useful attributes in our collector, with these results: our storage fills up more slowly and Tempo search is faster since these changes were made. The fourth one brought the smallest improvement, but it's still better than nothing :)
Thank you again, @joe-elliott!
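
As a sketch of the attribute-dropping step mentioned above, this is roughly what it could look like if the collector is the OpenTelemetry Collector (contrib distribution). That is an assumption, since the thread does not name the collector. The attribute keys and the exporter endpoint are hypothetical placeholders, not the ones actually used here.

```yaml
# Sketch only: assumes the OpenTelemetry Collector (contrib) and made-up attribute keys.
receivers:
  otlp:
    protocols:
      grpc:

processors:
  # The attributes processor can delete span attributes by key.
  attributes/drop-heavy:
    actions:
      - key: http.request.body      # hypothetical large attribute
        action: delete
      - key: debug.payload          # hypothetical unused attribute
        action: delete
  batch:

exporters:
  otlp:
    endpoint: tempo-distributor:4317   # hypothetical Tempo endpoint
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [attributes/drop-heavy, batch]
      exporters: [otlp]
```

The attributes processor works on span attributes; dropping resource-level attributes would need the resource processor instead.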

@tcaty tcaty closed this as completed May 2, 2024