Update AsyncClient to use Semaphores #1916
Open
Problem: Every request in tweepy is attempted first, and if a rate limit is in effect, the request sleeps until the rate limit expires. In other words, requests operate with no memory of the current limit. This is fine in synchronous code, where at most one request is sleeping at a time. In the AsyncClient, however, a user could send 1,000 requests within a second. The first ones would succeed until the rate limit was hit, and the rest would all start sleeping at once, which means they would all be released at once.
The result is a burst of extra requests to Twitter that all return 429.
This pull request implements a custom Semaphore that acts as a memory of the current rate limit and doesn't send more requests to Twitter than the known limit allows. The Semaphore automatically resets at the reset time and releases more requests to Twitter.
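To make the idea concrete, here is a minimal sketch of a rate-limit-aware semaphore. It is not the class added in this pull request: the name `RateLimitSemaphore`, the `limit` and `window` parameters, and the `update` method are all hypothetical, and it assumes the reset time is expressed in epoch seconds.

```python
import asyncio
import time


class RateLimitSemaphore:
    """Hypothetical sketch, not the class added by this pull request."""

    def __init__(self, limit, window):
        self.limit = limit                      # requests allowed per window
        self.window = window                    # assumed window length, seconds
        self.remaining = limit
        self.reset_at = time.time() + window    # epoch seconds

    def update(self, remaining, reset_at):
        """Record what a response reported about the rate limit."""
        if reset_at > self.reset_at:
            # The server reports a later window: adopt its numbers.
            self.remaining, self.reset_at = remaining, reset_at
        else:
            # Same or older window: never raise our count from a stale response.
            self.remaining = min(self.remaining, remaining)

    async def acquire(self):
        """Return once this request fits in the known budget."""
        while True:
            now = time.time()
            if now >= self.reset_at:
                # Window rolled over: refill and estimate the next reset.
                self.remaining = self.limit
                self.reset_at = now + self.window
            if self.remaining > 0:
                self.remaining -= 1
                return
            # Budget exhausted: sleep until the reset, then re-check.
            await asyncio.sleep(max(0.0, self.reset_at - now))
```

The difference from the sleep-and-retry behavior is that every waiter re-checks the budget after waking, so only `limit` requests proceed per window and the rest go back to sleep instead of hitting Twitter and receiving 429.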
Since different endpoints have different rate limits, the semaphores are stored in a defaultdict, so a new semaphore is automatically created for each endpoint the user requests. Some endpoints are known to share a rate limit. This was mostly ignored, but every response that is returned updates its semaphore, so if a request to another endpoint has lowered the shared remaining count, the semaphore eventually gets updated (I think).
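The per-endpoint bookkeeping could look roughly like the sketch below. The `EndpointLimit` class, the `record_response` helper, and the `(method, route)` key are hypothetical and chosen for illustration; the sketch assumes the responses carry the `x-rate-limit-remaining` and `x-rate-limit-reset` headers.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class EndpointLimit:
    """Hypothetical per-endpoint memory of the last known rate limit."""
    remaining: Optional[int] = None   # None until the first response arrives
    reset_at: float = 0.0             # epoch seconds


# One entry per (HTTP method, route), created lazily on first access.
limits = defaultdict(EndpointLimit)


def record_response(method, route, headers):
    """Overwrite the stored state with what the server just reported."""
    state = limits[(method, route)]
    state.remaining = int(headers["x-rate-limit-remaining"])
    state.reset_at = float(headers["x-rate-limit-reset"])
    return state


# Illustrative values only.
record_response(
    "GET",
    "/2/tweets",
    {"x-rate-limit-remaining": "299", "x-rate-limit-reset": "1700000000"},
)
```

Because each response overwrites the stored count, an endpoint whose shared limit was consumed through a different route is corrected the next time it receives a response of its own, at the cost of possibly sending a few requests too many before that happens.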
I'm not set up to test tweepy (I don't have a Twitter developer account), so it would be good if someone else could do some integration testing. I ran some tests in a mocked-up scenario, but this needs to be tested against the actual Twitter API to make sure it works before it gets merged.
Thanks for your consideration!