This repository has been archived by the owner on Feb 18, 2021. It is now read-only.

Statsrelay fault tolerance #64

Open
fabeschan opened this issue Jun 24, 2016 · 4 comments

Comments

@fabeschan

Hey guys,

I was under the impression that if a statsd host goes down, statsrelay would divert the metrics that would otherwise be routed to the dead host over to the remaining hosts, taking advantage of the consistent hashring. But it seems like this doesn't actually happen. Is this intended, or a bug?
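For context on what's being asked: with a consistent hashring, each metric key hashes to a fixed position on a ring and is always routed to the same backend. Below is a minimal illustrative sketch of that routing scheme in Python (this is not statsrelay's actual C implementation; the backend names and vnode count are made up for the example):

```python
import hashlib
from bisect import bisect

class HashRing:
    """Toy consistent hashring: each backend owns many virtual points on
    the ring; a key is routed to the first backend point clockwise of
    the key's own hash, so a given key always lands on the same host."""

    def __init__(self, backends, vnodes=64):
        self.ring = []
        for backend in backends:
            for i in range(vnodes):
                # Hash "backend:i" to place vnode points around the ring.
                point = int(hashlib.md5(f"{backend}:{i}".encode()).hexdigest(), 16)
                self.ring.append((point, backend))
        self.ring.sort()
        self.points = [p for p, _ in self.ring]

    def route(self, key):
        # Hash the metric key and find the next ring point clockwise.
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        idx = bisect(self.points, h) % len(self.ring)
        return self.ring[idx][1]

# Hypothetical backend names for illustration.
ring = HashRing(["statsd-a:8125", "statsd-b:8125", "statsd-c:8125"])
backend = ring.route("web.requests.count")  # deterministic: same key, same host
```

The question above is essentially: when the chosen backend is down, should `route` fall through to the next live backend on the ring, or keep returning the dead one?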

@mtrienis

+1

@JeremyGrosser
Copy link
Contributor

This is intended... If statsrelay diverted metrics to a different statsd instance, then you'd potentially have two statsd instances writing the same key,timestamp tuple to graphite with different values, neither of which would include all the data for that key. Statsrelay's use case is really more focused on performance, where a single statsd/statsite process cannot keep up with the volume of metrics you're sending.

@fabeschan
Author

Thank you for the explanation, @JeremyGrosser. If this is the intention, what would you recommend doing in the case of failed nodes?

@JeremyGrosser
Contributor

You might want to take a look at the Lyft fork (https://github.com/lyft/statsrelay), which supports sending metrics to multiple backends simultaneously... This way you could run two sets of carbon servers for redundancy.
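The multi-backend idea is simple duplication rather than diversion: every metric line is mirrored to each backend set, so losing one set loses no data and no key ever has partial aggregates. A hypothetical sketch (the Lyft fork is written in C with its own config format; the backend addresses here are placeholders):

```python
import socket

# Two independent backend stacks; each receives a full copy of every metric.
# Placeholder localhost addresses stand in for two real statsd/carbon stacks.
BACKEND_SETS = [("127.0.0.1", 8125), ("127.0.0.1", 8126)]

def relay(metric_line: str) -> None:
    """Mirror one statsd-format metric line to every configured backend."""
    payload = metric_line.encode()
    for host, port in BACKEND_SETS:
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.sendto(payload, (host, port))  # fire-and-forget UDP, like statsd
        sock.close()

relay("web.requests.count:1|c")
```

Because both stacks see the complete stream, either one can serve reads alone, which sidesteps the split-aggregate problem described above.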

InfluxDB is worth a look too... It would replace your carbon servers for persistence and has its own replication/sharding implementation. I had quite a few issues last time I tried it, but I've heard it's gotten more stable since then.

theatrus referenced this issue in lyft/statsrelay Oct 29, 2017
NOT PASSING TESTS
theatrus referenced this issue in lyft/statsrelay May 1, 2020
NOT PASSING TESTS