Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for "condensed" statsd protocol. #116

Open
drawks opened this issue Mar 18, 2015 · 11 comments
Open

Support for "condensed" statsd protocol. #116

drawks opened this issue Mar 18, 2015 · 11 comments

Comments

@drawks
Copy link
Contributor

drawks commented Mar 18, 2015

In addition to the standard single update string protocol, the etsy (reference) implementation of stats supports appending multiple timer updates for a single metric as well as multiple different types of updates for a single key to be combined:

I can't find official documentation for this optimization, but you can see a proxy which implements pre-processing datagrams using this technique at https://github.com/juliusv/ne-statsd-backend and sure enough if you test the example updates against the etsy statsd they all work as described.

This can provide a nice bump in wire efficiency for some use patterns:

Here are few inline examples in case the above url goes dead:

foo:8|ms
foo:5|ms
foo:3|ms

becomes:

foo:8|ms:5|ms:3|ms

which is 33% smaller in this case and could easily be much more efficient in the case of a longer metric name (which would be typical) or a metric with a very high update frequency.

or

foo:1|c
foo:3|c
foo:5|ms
foo:3|ms
foo:6|g
foo:7|g

becomes:

foo:4|c:5|ms:3|ms:7|g

which is something like 50% smaller with similar notes as above WRT metric name size and update frequency.

@armon
Copy link
Collaborator

armon commented Mar 18, 2015

Interesting. At this point, this is not something I have time to tackle. The current ASCI format parser has been fine tuned for it's current performance, so doing a casual attempt at this would likely result in a dramatic loss of parsing performance.

@sleepybishop
Copy link
Contributor

i replaced the manual parser with a ragel based one without a drop in performance at sleepybishop/statsite@eeca0c0

i then implemented condensed protocol at sleepybishop/statsite@d6e82a7

there seems to be a ~3-5% drop in performance in my benchmarks. i wouldn't call it dramatic but at the same time i don't see the wide use case for the condensed protocol.

YMMV depending on you machine specs.

@drawks
Copy link
Contributor Author

drawks commented Dec 22, 2015

Woh! Good stuff 👏

@shamil
Copy link

shamil commented May 2, 2016

Will this be implemented/merged? I see some statsd client libraries are using explicity this format. Like kamon-statsd.

@sleepybishop
Copy link
Contributor

doubtful, its a fairly intrusive change that would affect all users regardless of whether they need the feature or not. i do make an effort to keep the ragel_parser branch up to date though.

@drawks
Copy link
Contributor Author

drawks commented May 3, 2016

It's sort of disappointing that adopting a protocol compliant parser which, by all appearances, causes only a very small performance regression in favor of much more easy to read/understand/maintain parser code is seen as an "intrusive change" and is unlikely to be adopted. I would expect that maintaining compatibility with the protocol as defined in the reference implementation would be a higher priority than a small performance regression which is unlikely to impact most users.

@sleepybishop
Copy link
Contributor

ill preface this by saying it's ultimately not my call as to whether or not it gets merged, i was only aiming to be helpful when i wrote the patch. the code modified had been stable for years, apologies if the "intrusive" label does not seem accurate.

i dont think armon is against merging deep changes as he was quite understanding when i replaced the event system with the ae loop and im quite open to making any changes needed to make it more palatable for merge consideration, but there are a lot of users for which performance is king which is probably why they chose statsite in the first place, negatively impacting those users doesn't seem like the right thing to do.

@shamil
Copy link

shamil commented May 6, 2016

@sleepybishop , maybe we should have compile time option, like --with-ragel-parser. So we can decide which parser to use and everybody will stay happy :)

@sleepybishop
Copy link
Contributor

fair enough, ill work on a pr to do some refactoring to make a parser change more straightforward and see if we can get that merged in first.

@drawks
Copy link
Contributor Author

drawks commented May 8, 2017

Happy birthday to this issue! @sleepybishop any chance you've worked on this at all?

@sleepybishop
Copy link
Contributor

ive worked on it some and ive kept my branch up to date but i have not made much progress beyond that. apologies.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants