
bug: When there is a large amount of traces, calculating model costs and usage is very slow and ultimately yields no results #2034

Open
secsilm opened this issue May 11, 2024 · 7 comments
Labels
blocked-v3, bug, performance

Comments

@secsilm

secsilm commented May 11, 2024

Describe the bug

We have been using Langfuse in our production environment and have generated about 1.65 million traces over the past month. When I check usage for the past week, the response time is still acceptable.

However, when I select the last month, the "Model costs", "Model usage", and "User consumption" sections take a long time to load (maybe 5-10 minutes?), and then the loading icon disappears without displaying any results. The CPU usage of Postgres also surges.

[Screenshot: dashboard panels stuck on the loading indicator]

[Screenshot: CPU usage of the Postgres container]

To reproduce

Generate a large number of traces and then view the dashboard.

SDK and container versions

Self-hosted Langfuse: 2.38.0

Additional information

No response

Are you interested in contributing a fix for this bug?

No

@marcklingen
Member

We’re currently preparing v3 to address these performance issues with analytical queries: https://github.com/orgs/langfuse/discussions/1902

In the meantime, database IOPS is most likely the bottleneck; improving it is what you could try in order to make this faster.
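If you want to confirm where that IO is going, something like the following can help. This is a rough sketch only: it assumes the pg_stat_statements extension can be enabled on the Langfuse Postgres instance (it must be in shared_preload_libraries) and uses Postgres 13+ column names.

```sql
-- Rough sketch, assuming pg_stat_statements is preloaded and can be enabled.
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- List the statements that read the most blocks, to see which dashboard
-- aggregations are driving the IO load.
SELECT calls,
       round(total_exec_time::numeric, 1) AS total_ms,
       shared_blks_read,
       left(query, 120)                   AS query_preview
FROM pg_stat_statements
ORDER BY shared_blks_read DESC
LIMIT 10;
```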

@arthurGrigo

+1

I also have the feeling that there is quite some latency between when a prompt chain completes and when the full trace is available in the UI.

@marcklingen
Member

> +1
>
> I also have the feeling that there is quite some latency between when a prompt chain completes and when the full trace is available in the UI.

Interesting, this should not be the case, as the SDKs flush events every second by default. How long do you need to wait?
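To rule out client-side batching as the cause, you can also force the queue to drain explicitly. A minimal sketch against the v2 Python SDK; the flush_at / flush_interval values shown are roughly the documented defaults, but treat them as assumptions and check the docs for your version:

```python
from langfuse import Langfuse

# Credentials and host are read from LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY,
# and LANGFUSE_HOST. The two knobs below control client-side batching.
langfuse = Langfuse(
    flush_at=15,         # send a batch once this many events are queued
    flush_interval=0.5,  # ...or after this many seconds, whichever comes first
)

# ... run the prompt chain, create traces/generations ...

# Block until everything queued so far has been sent; the trace should be
# visible in the UI right after this returns.
langfuse.flush()
```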

@arthurGrigo

arthurGrigo commented May 20, 2024

I have not measured it, but sometimes 1 to 2 minutes for really complex prompt chains.

I should have mentioned that I run Langfuse locally using Docker Compose.

@marcklingen
Member

> I have not measured it, but sometimes 1 to 2 minutes for really complex prompt chains.
>
> I should have mentioned that I run Langfuse locally using Docker Compose.

Thanks for sharing. This will improve dramatically with Langfuse v3. In the meantime, you could tweak the behavior by increasing the number of threads (docs), as I assume the SDK here is backlogged with events to send to the API.
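For example, on the v2 Python SDK the consumer thread count is a constructor argument. A minimal sketch (the default of a single consumer thread is my assumption, see the SDK docs):

```python
from langfuse import Langfuse

# Default is a single background consumer thread; a second one lets the
# SDK drain its event queue faster when a complex chain emits many events.
langfuse = Langfuse(threads=2)
```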

@arthurGrigo

> Thanks for sharing. This will improve dramatically with Langfuse v3. In the meantime, you could tweak the behavior by increasing the number of threads (docs), as I assume the SDK here is backlogged with events to send to the API.

Thanks for the hint!
The docs say one should only use it if really necessary. Are there any known drawbacks or bugs when increasing the number of threads?

@marcklingen
Member

> Thanks for the hint! The docs say one should only use it if really necessary. Are there any known drawbacks or bugs when increasing the number of threads?

Performance: it creates additional background threads and might need more time when you flush/shutdown (joining the threads). If the current latency is bearable for you right now, you might be better off waiting for v3, or just try how things improve with threads=2.
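A small illustration of that tradeoff (a sketch, not official guidance): the extra threads mostly cost you at the end of the process, when they are joined.

```python
from langfuse import Langfuse

langfuse = Langfuse(threads=2)

# ... application work that creates traces ...

# shutdown() flushes any remaining events and joins the consumer threads;
# with more threads this join is where the extra time at exit can show up.
langfuse.shutdown()
```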
