[Feature]: time parameters for querying trace by id #4150

Open
alburthoffman opened this issue Jan 11, 2023 · 5 comments · May be fixed by #4352

Comments

@alburthoffman

Requirement

Add start_time and end_time as optional parameters in https://github.com/jaegertracing/jaeger-idl/blob/main/proto/api_v2/query.proto#L37

Problem

When used with the Grafana console, the trace ID view takes a long time to load.

This is because the query has to scan all traces in the database. We have about 4M sampled traces per day, the data is growing, and we store several days of data in the database.

Proposal

It would be good to add time parameters when querying by trace ID, as the Tempo API does (https://grafana.com/docs/tempo/latest/api_docs/#query).

The Grafana console already has a time window.
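
To illustrate the proposal, here is a minimal sketch of how a SQL-based backend might use the optional time window to bound a trace-by-ID lookup. The function name, the table name `spans`, and the columns `trace_id` and `timestamp` are hypothetical and are not taken from any Jaeger storage plugin; only the idea of optional start and end parameters comes from this issue.

```python
from datetime import datetime, timezone
from typing import List, Optional, Tuple


def build_trace_query(
    trace_id: str,
    start_time: Optional[datetime] = None,
    end_time: Optional[datetime] = None,
) -> Tuple[str, List[object]]:
    """Return (sql, params) for a trace-by-ID lookup.

    Table and column names are hypothetical. The time bounds are
    optional: when omitted, the query falls back to a lookup by ID only.
    """
    clauses = ["trace_id = %s"]
    params: List[object] = [trace_id]
    if start_time is not None:
        clauses.append("timestamp >= %s")
        params.append(start_time)
    if end_time is not None:
        clauses.append("timestamp <= %s")
        params.append(end_time)
    sql = "SELECT * FROM spans WHERE " + " AND ".join(clauses)
    return sql, params


# Example: bound the lookup to a one-day window.
window_start = datetime(2023, 1, 11, tzinfo=timezone.utc)
window_end = datetime(2023, 1, 12, tzinfo=timezone.utc)
sql, params = build_trace_query("abc123", window_start, window_end)
```

With a table partitioned or ordered by time, the extra predicates would let the database skip data outside the window instead of scanning everything.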

Open questions

No response

@yurishkuro
Member

What storage backend are you using?

I'm not particularly opposed to this change. Even though I do not recall similar complaints, I assume if someone uses ES as storage with indices rotated regularly, then having a time range as a hint might narrow down which indices to query. However, I am not sure how that would work when people use an index alias; it would require some kind of support on the ES side to use the time range hint.
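
As a rough sketch of the index-narrowing idea for regularly rotated indices: given a time range hint, a backend could compute which daily indices to query. The `jaeger-span-` prefix and the one-index-per-day naming are assumptions for illustration, and the function name is hypothetical. This does not address the index alias case.

```python
from datetime import date, timedelta
from typing import List


def indices_for_range(
    start: date, end: date, prefix: str = "jaeger-span-"
) -> List[str]:
    """List one index name per day in [start, end], inclusive.

    Assumes daily rotation with names of the form <prefix>YYYY-MM-DD.
    """
    if end < start:
        raise ValueError("end must not be before start")
    days = (end - start).days
    return [
        prefix + (start + timedelta(days=i)).isoformat()
        for i in range(days + 1)
    ]


# Example: a three-day hint narrows the search to three indices.
names = indices_for_range(date(2023, 1, 10), date(2023, 1, 12))
```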

Other official Jaeger backends are kv-stores and don't need help looking up by trace ID.

@alburthoffman
Author

Our backend is ClickHouse. ClickHouse is not good at looking up an item by ID, especially when the ID is effectively a random number.

We simply have too many traces, about 4 billion traces per day. Holding them in memory would require a lot of memory.

A kv store can simply ignore the start and end time.
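
A minimal sketch of that point, using a hypothetical in-memory key-value store (the class and method names are invented for illustration and are not a Jaeger API): the reader accepts the optional time hints so the call signature stays uniform across backends, but does not use them.

```python
from datetime import datetime
from typing import Dict, List, Optional


class InMemoryTraceStore:
    """Hypothetical kv backend: a trace ID maps directly to its spans."""

    def __init__(self) -> None:
        self._traces: Dict[str, List[dict]] = {}

    def write_span(self, trace_id: str, span: dict) -> None:
        self._traces.setdefault(trace_id, []).append(span)

    def get_trace(
        self,
        trace_id: str,
        start_time: Optional[datetime] = None,
        end_time: Optional[datetime] = None,
    ) -> List[dict]:
        # The time hints are accepted but deliberately ignored: a direct
        # key lookup does not need them to find the trace.
        return list(self._traces.get(trace_id, []))


store = InMemoryTraceStore()
store.write_span("abc123", {"operation": "GET /"})
spans = store.get_trace("abc123", start_time=datetime(2023, 1, 11))
```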

This feature seems like the more general case, since the API currently assumes a global scan with no other options.

@vjsamuel

vjsamuel commented Feb 1, 2023

@yurishkuro, this would greatly benefit us given the volume of spans we ingest into our ClickHouse cluster. We would be more than happy to contribute the enhancement as well.

@yurishkuro
Member

Go for it.

Speaking of ClickHouse: are you using https://github.com/jaegertracing/jaeger-clickhouse? I recently opened #4196. There is another implementation in OTel Collector Contrib which, as I understand it, has an additional small table for looking up the time range based on trace ID.

@alburthoffman
Author

@yurishkuro Thanks for the information. We evaluated the index table solution long ago. It does not help much because the trace ID is a random number, and a ClickHouse index does not help with random numbers.

When the traffic volume is not high, the index table solution can be used.

I will work on the time parameters and send a PR. Thanks.
