Large `flush_interval` causing odd synchronization behavior across all clients (#215)
I think it's somewhat related to what I reported here. Statsite calculates the flush time relative to the first run, not on an absolute schedule every defined period, so the execution time of the flush function can cause drift over the long run. My suggestion is to use an absolute timer.
@luca3m Reading through your ticket, I would also prefer an absolute timer interval as opposed to a relative one. I believe this behavior is being caused by something in the
I am not too sure about portability, but would it not be easier to just implement the timer system using the POSIX
After running `statsite` for a few days with a flush interval of one hour, all clients seem to be synchronizing their flush times after being somewhat randomly distributed to begin with. After two days of uptime, roughly 80+% of clients are logging at exactly the same time, exactly 30m into the hour; they were somewhat evenly distributed in [0m, 40m] to begin with. Is there anything being done that is inherently causing this synchronization? This presents an issue: when logging to a 3rd-party service from `statsite`, all requests hit the service at the same time, when it would be significantly better to have them more evenly distributed. I could combat it with a random sleep on my end.

The image below is taken over roughly 48 hours. Each row in the image has its own scale, so the bar heights don't mean much; what's interesting is the distribution of the clients' log attempts during the hour interval. As time goes on (downwards), we can see the clustering around the 30m mark of the hour. If one were to zoom in on the cluster at the 30m mark, it is incredibly tight (within 30s of exactly the 30m mark).