✨ 🔧 [Telemetry] Add metrics to measure health, latency, request rate to model providers #223

Open
roma-glushko opened this issue Apr 27, 2024 · 1 comment


Measure the following metrics for every configured model provider:

  • the number of successful requests (counter)
  • the number of failed requests (counter)
  • the response latency (non-streaming language chat requests)
  • the request rate to each provider
  • the first-chunk latency (streaming language chat requests)
@roma-glushko roma-glushko added this to the Glide: Public Preview milestone Apr 27, 2024
@roma-glushko roma-glushko self-assigned this Apr 27, 2024
@roma-glushko roma-glushko removed their assignment May 1, 2024
@roma-glushko roma-glushko modified the milestones: Glide: Public Preview, Telemetry Setup May 1, 2024

gernest commented May 10, 2024

> request rate to each provider

Rate is a derived metric: it is computed by the time series store, not by the application. A total_requests counter is enough, because the request rate can be computed from it.
