Optimize request intervals for full archive search with OAuth 2.0 Bearer Token for pagination.py #1923
With this PR I would like to resolve the issues raised in #1688, #1907, and #1871: there is no wait time between requests to the Twitter API v2, so a full archive search immediately exceeds the rate limits.
The problem seems not to be in https://github.com/tweepy/tweepy/blob/master/tweepy/client.py as in #1871, but rather in https://github.com/tweepy/tweepy/blob/master/tweepy/pagination.py as described in #1688.
One workaround is to simply wait one second after each request and its data processing, but this requires the user to know about the problem and account for it.
Likewise, the optimal rate of 1 request per second and 300 requests per 900 seconds (3 seconds per request plus processing) may not be reachable if 1 second is always added on top of the request and processing time.
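To make the arithmetic concrete, here is a rough sketch comparing the two strategies. The function names are hypothetical and not part of Tweepy, and the limits are simply the figures quoted in this description. A fixed sleep stretches every cycle by a full second, whereas padding only up to the one-second minimum (the approach proposed at the end of this description) leaves longer cycles untouched.

```python
# Rate-limit figures as quoted in this PR description (assumed, not queried from the API).
WINDOW_SECONDS = 900
REQUESTS_PER_WINDOW = 300
MIN_INTERVAL = 1.0  # at most 1 request per second

# Average time available per request if the whole window is used evenly.
BUDGET_PER_REQUEST = WINDOW_SECONDS / REQUESTS_PER_WINDOW


def cycle_fixed_sleep(work_seconds, sleep_seconds=1.0):
    """Cycle length when a fixed sleep is appended to request + processing time."""
    return work_seconds + sleep_seconds


def cycle_padded(work_seconds, min_interval=MIN_INTERVAL):
    """Cycle length when the cycle is only padded up to the minimum interval."""
    return max(work_seconds, min_interval)
```

For example, with 2.5 seconds of request and processing time, the fixed sleep gives a 3.5 second cycle, which is slower than the 3 second average the window allows, while padding leaves the cycle at 2.5 seconds.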
Mainly, though, I think the problem should be solved within Tweepy, since users assume that Tweepy implements Twitter's guidelines (at least I did, and it took me some time to find the problem).
Since the Twitter guidelines only require a limit of 1 request per second for the full archive search `/2/tweets/search/all` in combination with an OAuth 2.0 Bearer Token (cf. https://developer.twitter.com/en/docs/twitter-api/tweets/search/migrate), it seems appropriate to measure the time from the beginning of `__next__` in https://github.com/tweepy/tweepy/blob/master/tweepy/pagination.py and to pad out the remainder of that second after receiving the response.

Cheers :-)
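For reference, the idea can be sketched independently of Tweepy roughly as follows. This is a minimal illustration, not the actual diff: `paginate_with_min_interval` and the `fetch_page` callback (which is assumed to return a `(page, next_token)` pair) are hypothetical names, and the `clock` and `sleep` parameters exist only so the timing can be tested without real waiting.

```python
import time


def paginate_with_min_interval(fetch_page, min_interval=1.0,
                               clock=time.monotonic, sleep=time.sleep):
    """Yield pages while keeping each request cycle at least min_interval long.

    fetch_page is a hypothetical callback taking the current pagination token
    (None for the first page) and returning (page, next_token).
    """
    next_token = None
    while True:
        started = clock()  # measure from the beginning of the request
        page, next_token = fetch_page(next_token)
        elapsed = clock() - started
        if elapsed < min_interval:
            # Pad out the remainder of the interval after the response arrives.
            sleep(min_interval - elapsed)
        yield page
        if next_token is None:
            return
```

Because only the remainder of the interval is slept, a request that already takes longer than one second adds no extra delay.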