
Automatic throttling to conform to Twitch rate limits #23

Open
RAnders00 opened this issue Oct 3, 2020 · 5 comments
Labels
enhancement New feature or request

Comments

@RAnders00
Collaborator

Add a mechanism to enable automatic throttling in case the message rate exceeds the limits imposed by Twitch. Should support known and verified bots.

RAnders00 added the enhancement (New feature or request) label Oct 3, 2020
@johnpyp

johnpyp commented Jan 17, 2021

What is the expected behaviour when a connection is being rate limited? At fast join rates with a lot of channels (I'm testing with about 2,000), I pretty often see errors like this:

[2021-01-17T01:53:49Z INFO  twitch_irc::connection::event_loop] Closing connection, cause: Error received from incoming stream of messages: WebSocket protocol error: Connection reset without closing handshake

Is this indicative of rate limiting, with Twitch forcibly closing the WebSocket connection in response, or simply the nature of opening that many WebSockets at once? My guess would be the former, considering it should only be around 21 WebSocket connections.

The documented Twitch rate limit is "20 join attempts per 10 seconds per user", so the default join rate is obviously far faster than that. That said, even with the elevated rate of connection resets, the client seems to climb to peak message throughput far faster than when I slow down the join rate settings.

I'm using an anonymous connection, and these are my current join rate settings (which I believe are the defaults):

max_channels_per_connection: 90,
connection_rate_limiter: Arc::new(Semaphore::new(1)),
new_connection_every: Duration::from_secs(2),
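
For reference, here is a minimal sketch of how those three settings plug into a twitch_irc ClientConfig. The field names are the ones quoted above; the anonymous-credentials setup around them is my assumption about the crate's API and may not match every version:

```rust
use std::sync::Arc;
use std::time::Duration;

use tokio::sync::Semaphore;
use twitch_irc::login::StaticLoginCredentials;
use twitch_irc::ClientConfig;

fn join_heavy_config() -> ClientConfig<StaticLoginCredentials> {
    // Start from the defaults with an anonymous login (assumed API).
    let mut config = ClientConfig::new_simple(StaticLoginCredentials::anonymous());
    config.max_channels_per_connection = 90;                      // channels joined per connection
    config.connection_rate_limiter = Arc::new(Semaphore::new(1)); // open one connection at a time
    config.new_connection_every = Duration::from_secs(2);         // wait 2s between new connections
    config
}
```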

@RAnders00
Collaborator Author

RAnders00 commented Jan 17, 2021

Sadly, the whole rate limiting on Twitch's end is not a very exact science, and Twitch has been very bad at documenting what's actually going on. So the fact that opening connections is rate limited in my library is based entirely on my own empirical evidence. (Opening tons of connections at once means they all time out; I'm guessing it's some kind of firewall, not even an application-level rate limit.) The 2 second wait between opening connections is also just a "has worked well for me" value.

I personally run the recent-messages2 service, where I use TLS connections (not WebSocket). My server is in Germany, and I'm able to open and hold 720 parallel connections with the exact same settings you posted above (joined to roughly 65 thousand channels).

That's also why I'm kind of skeptical about the documented JOIN limits. My bot works fine even though I definitely go over the limit, but perhaps that's because the connections are anonymous. It's yet another thing that isn't documented, which means more guesswork for me when implementing the library.

However, I'm very open to more experimentation to find out whether opening WebSocket connections works differently and perhaps needs different rate limit parameters, or a different rate limiting strategy altogether. Could you do me a favour and try out some other values for the three parameters? Especially try lowering max_channels_per_connection to the documented limit of 20, just to find out whether you're hitting that rate limit, and also try increasing the new_connection_every duration to several seconds.

Also, this issue was originally about the PRIVMSG limits (known/verified bots). But we can discuss connection limits here too, I guess.

@johnpyp

johnpyp commented Jan 17, 2021

Ah, sorry about posting in the wrong issue; there's another one for the JOIN command, and I must've just clicked the wrong one.

It seems that after multiple runs on the same settings, large numbers of timeouts don't actually happen that often, so the defaults might work fine. I remember attempting to join 50,000 channels and seeing a massive cascade of failures after ~30 seconds, but notably that was over TLS rather than WebSocket. Right now, at my current number of channels, the connection failures are limited enough that they aren't a big issue. I'm interested in stress testing further though, so I'll try different settings at some point and report back with my findings.

@RAnders00
Collaborator Author

Please do report back. I'm definitely interested in finding possible improvements to the rate limiting/connection strategy.

@demize
Contributor

demize commented Oct 31, 2021

I added rate limiting to my bot with the governor crate, just for the global limits; I've got it set to 200 messages per minute right now. I could take a crack at implementing it in the library if you think it'd be a good approach. I'd take a better approach with governor there: instead of Quota::per_minute(nonzero!(200u32)), I'd build a quota manually that replenishes one cell per 300ms for non-verified bots and one per 4000us for verified bots, with maximum bursts of 100 and 7500 messages respectively. I'd also put it behind a feature flag so only people who actually care would need to enable it. There's a convenient .until_ready() future that would be my initial approach to using it here, but governor also provides a .check() function if returning an Error would be preferable to waiting for an open cell.
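
Here's a minimal sketch of what that manually built quota could look like with governor. The 300ms/100-burst (and 4000us/7500-burst for verified bots) numbers come from my description above; the tokio runtime and the rest of the setup are only illustrative:

```rust
use std::num::NonZeroU32;
use std::time::Duration;

use governor::{Quota, RateLimiter};

#[tokio::main]
async fn main() {
    // Known (non-verified) bot: replenish one cell every 300ms, burst of 100.
    // A verified bot would instead use Duration::from_micros(4000) and a burst of 7500.
    let quota = Quota::with_period(Duration::from_millis(300))
        .expect("period must be non-zero")
        .allow_burst(NonZeroU32::new(100).unwrap());

    let limiter = RateLimiter::direct(quota);

    // Option 1: wait until a cell is free before sending the PRIVMSG.
    limiter.until_ready().await;

    // Option 2: fail fast with an error instead of waiting.
    if limiter.check().is_err() {
        eprintln!("sending now would exceed the global rate limit");
    }
}
```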

My main concern would be addressing the per-channel rate limits for verified bots; I don't think it's as serious a concern as the global rate limits, but I'd prefer to address both if possible. The governor crate includes support for keyed rate limiters, and I imagine that would be the way to go: one direct rate limiter for the global limit, then one keyed rate limiter for all the channels.
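
And a sketch of that direct-plus-keyed layout, keyed by channel login. The per-channel quota value and the channel name here are placeholders, not Twitch's actual per-channel numbers:

```rust
use std::num::NonZeroU32;
use std::time::Duration;

use governor::{Quota, RateLimiter};

#[tokio::main]
async fn main() {
    // Global limit: one cell per 300ms, burst of 100 (non-verified bot numbers from above).
    let global_quota = Quota::with_period(Duration::from_millis(300))
        .expect("period must be non-zero")
        .allow_burst(NonZeroU32::new(100).unwrap());
    // Per-channel limit: placeholder value, purely illustrative.
    let channel_quota = Quota::per_minute(NonZeroU32::new(20).unwrap());

    let global_limiter = RateLimiter::direct(global_quota);
    let channel_limiter = RateLimiter::keyed(channel_quota); // keyed by channel login

    let channel = String::from("some_channel");

    // Wait for both the global and the per-channel limiter before sending.
    global_limiter.until_ready().await;
    channel_limiter.until_key_ready(&channel).await;
    // sending the PRIVMSG to `channel` would go here
}
```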

It shouldn't take me a ton of work to get something working, though I'd like to make sure this sounds like a good approach to you before I go poking around your code and making a PR.
