Optimize request intervals for full archive search with OAuth 2.0 Bearer Token for pagination.py #1923

Closed

TomatenMarc

With this PR I would like to solve the issues raised in #1688, #1907, and #1871: there is no wait time between requests to the Twitter API v2, so a full-archive search exceeds the rate limit almost immediately.

The problem does not seem to be in https://github.com/tweepy/tweepy/blob/master/tweepy/client.py, as suggested in #1871, but rather in https://github.com/tweepy/tweepy/blob/master/tweepy/pagination.py, as described in #1688.

One workaround is to simply wait one second after each request and the processing of its data, but this requires the user to know about the problem and to account for it.
It can also happen that the optimal rates of 1 request per second and 300 requests per 900 seconds (3 seconds per request plus processing) cannot be reached if 1 second is always added on top of the request and processing time.
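
To make the difference concrete, here is a rough sketch of the arithmetic. The function names and the sample durations are hypothetical and only serve as illustration; the limits are the ones quoted above.

```python
WINDOW_SECONDS = 900        # 15-minute rate-limit window
REQUESTS_PER_WINDOW = 300   # requests allowed per window
MIN_INTERVAL = 1.0          # at most 1 request per second

# Average time budget per request that exactly uses up the window.
budget = WINDOW_SECONDS / REQUESTS_PER_WINDOW  # 3.0 seconds


def interval_fixed_sleep(work_seconds, sleep_seconds=1.0):
    """Spacing between requests when a fixed sleep follows request + processing."""
    return work_seconds + sleep_seconds


def interval_fill_up(work_seconds, min_interval=MIN_INTERVAL):
    """Spacing when only the remainder of the minimum interval is waited."""
    return max(work_seconds, min_interval)
```

If request plus processing takes 2.5 seconds, the fixed sleep spaces requests 3.5 seconds apart, which is slower than the 3-second budget, while filling up only the remaining time keeps the spacing at 2.5 seconds.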

Mainly, though, I think the problem should be solved within Tweepy, since users assume that Tweepy implements Twitter's guidelines (at least I did, and it took me some time to find the problem).

Since the Twitter guidelines only require a limit of 1 request per second for the full-archive search endpoint /2/tweets/search/all in combination with an OAuth 2.0 Bearer Token (cf. https://developer.twitter.com/en/docs/twitter-api/tweets/search/migrate), it seems appropriate to measure the time from the start of __next__ in https://github.com/tweepy/tweepy/blob/master/tweepy/pagination.py and, after receiving the response, to wait out the remainder of that second.
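
The idea can be sketched as a small wrapper around any page iterator. This is not the code from the PR or from Tweepy itself: `ThrottledIterator` and its parameters are hypothetical names, and the `clock` and `sleep` arguments exist only so the behaviour can be checked without really waiting.

```python
import time


class ThrottledIterator:
    """Hypothetical wrapper: each __next__ call takes at least `min_interval` seconds."""

    def __init__(self, pages, min_interval=1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._pages = iter(pages)
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep

    def __iter__(self):
        return self

    def __next__(self):
        started = self._clock()
        # Fetch the next page; StopIteration propagates unchanged.
        page = next(self._pages)
        # Wait only for whatever is left of the minimum interval.
        remaining = self._min_interval - (self._clock() - started)
        if remaining > 0:
            self._sleep(remaining)
        return page
```

Assuming a Tweepy v4 client, something like `ThrottledIterator(tweepy.Paginator(client.search_all_tweets, query))` should be usable in a `for` loop in the same way as the paginator itself. A request that already takes longer than one second is not delayed any further.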

Cheers :-)

@vgewilliam

Hi dude, thanks for your contribution, it works for me.

@TomatenMarc
Author

@vgewilliam Is there anything else to take into account or can this be merged? :-)

@TomatenMarc TomatenMarc closed this by deleting the head repository Aug 23, 2023