Sorry, Stack Overflow blocked our request. Try again in a minute. #84

Open
aahnik opened this issue Nov 30, 2020 · 11 comments
Labels
bug Something isn't working high-priority

Comments

@aahnik

aahnik commented Nov 30, 2020

I am a new user of rebound; I installed it today (via pip).

I am getting this every time:

Sorry, Stack Overflow blocked our request. Try again in a minute.

I am running on Ubuntu 20.04, and I have made sure that I have proper internet connectivity.

@Koubae

Koubae commented Dec 1, 2020

Having the same issue here. I installed it via pip yesterday and am getting the above-mentioned error.

@shobrook
Owner

shobrook commented Jan 3, 2021

Hey all, I'm aware of this issue. It seems that Stack Overflow has gotten stricter about bot detection and is doing a captcha check every time rebound makes a request. One solution is to use the StackExchange API instead of a web scraper, but this would require rebound users to register for an API token. It would also require a refactor of rebound.py. Please let me know if you have any other ideas or would like to work on this.
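For anyone weighing the API route, here is a minimal sketch of what a query against the Stack Exchange API could look like. It assumes the public `/search/advanced` endpoint of API version 2.3; the function names (`build_search_url`, `parse_results`, `search`) are hypothetical and not part of rebound.py. As far as I understand the API docs, the `key` parameter is what raises the request quota, so it is treated as optional here.

```python
import gzip
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_ROOT = "https://api.stackexchange.com/2.3"


def build_search_url(query, key=None, page_size=5):
    """Build a /search/advanced URL for Stack Overflow questions matching `query`."""
    params = {
        "order": "desc",
        "sort": "relevance",
        "q": query,
        "site": "stackoverflow",
        "pagesize": page_size,
    }
    if key:  # optional app key (assumption: supplied by the user)
        params["key"] = key
    return f"{API_ROOT}/search/advanced?{urlencode(params)}"


def parse_results(payload):
    """Reduce the API's JSON wrapper to the fields a result list needs."""
    return [
        {"title": item["title"], "link": item["link"], "answers": item["answer_count"]}
        for item in payload.get("items", [])
    ]


def search(query, key=None):
    """Perform the request (needs network access)."""
    with urlopen(build_search_url(query, key)) as response:
        raw = response.read()
    # The API sends compressed bodies; urllib does not decompress them itself.
    if raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    return parse_results(json.loads(raw))
```

Keeping URL construction and response parsing separate from the network call means the refactor of rebound.py could be tested without hitting the API.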

shobrook added the bug and high-priority labels Jan 3, 2021

@hotheadhacker

It would take a lot of time to switch from a web scraper to an API module, and the API calls won't be enough and will become a bottleneck.

Why don't we use a more advanced web scraper?

@shobrook
Owner

@hotheadhacker Like what?

@hotheadhacker

Let me fork this and explain

@hotheadhacker

> @hotheadhacker Like what?

@shobrook I have bypassed the bot detection. If it still occurs, the user is prompted with a link to Stack Overflow to whitelist themselves. It works fine. You can check my forked repository.

Fork: https://github.com/hotheadhacker/rebound

@cristicretu

cristicretu commented Jan 20, 2021

@shobrook I think I know what the problem is. Making many requests in a short amount of time while using a random user agent for every request will trigger the captcha every time. I suggest using a single user agent per answer search, randomizing it only when the program is run (or just using a fixed UA, but that isn't a very good idea). I don't know if it will fix it, but it is certainly a step in the right direction. Another option is to use the user's UA from the default browser, so that requests don't differ from normal browsing. I will look into that later and try to fix it. I changed the UA to be randomized only when the program is run, and also fixed some minor anti-pattern issues and cleaned up the code.

You can check my fork here: https://github.com/cristicretu/rebound

I will also try using a single fixed user agent: Google's Googlebot user agent (https://developers.google.com/search/blog/2019/10/updating-user-agent-of-googlebot). It sometimes fixes the captcha issues.
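The "randomize once per run" idea can be sketched roughly as follows. This is an illustration under assumptions, not the code in the fork: the `USER_AGENTS` entries are placeholder strings, and `SESSION_USER_AGENT` and `build_request` are hypothetical names.

```python
import random
from urllib.request import Request

# Placeholder strings for illustration; not the list rebound ships with.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) Gecko/20100101 Firefox/84.0",
]

# Chosen once at import time, so every request in this run looks like one client.
SESSION_USER_AGENT = random.choice(USER_AGENTS)


def build_request(url):
    """Return a Request that carries the run-wide user agent."""
    return Request(url, headers={"User-Agent": SESSION_USER_AGENT})
```

The point of the design is that the choice happens outside the per-request code path, so a search that fetches several pages presents the same user agent on each of them.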

@shobrook
Owner

shobrook commented Jan 24, 2021

Thanks @cristicretu and @hotheadhacker. It seems like the user-agents are the issue here. Is there a reason why we can't just remove the list of user-agents and use the user's default agent when making the request to SO?

@cristicretu

> Is there a reason why we can't just remove the list of user-agents and use the user's default agent when making the request to SO?

That is the only solution, I think. Getting the user's default agent is a little bit tricky, but I will try to do it.

My idea is to use webbrowser to open a tab where the UA can be read, pass it to the script, and then continue. This should be done only on the first run, after which the info should be stored.

Do you have another idea? @shobrook
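One possible way to do this, sketched under assumptions: start a throwaway HTTP server on localhost, open a browser tab pointing at it with `webbrowser`, and read the `User-Agent` header from the browser's own request. The result is then cached in a JSON file so the tab only opens on the first run. The function names, the cache file layout, and the `timeout` default are all hypothetical choices for illustration.

```python
import json
import webbrowser
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path


def capture_user_agent(open_url=webbrowser.open, timeout=30):
    """Open a tab on a local server and read the browser's User-Agent header."""
    captured = {}

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            captured["ua"] = self.headers.get("User-Agent", "")
            body = b"You can close this tab."
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):  # keep the terminal quiet
            pass

    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: the OS picks a free port
    server.timeout = timeout
    try:
        open_url(f"http://127.0.0.1:{server.server_port}/")
        server.handle_request()  # returns after one request or after the timeout
    finally:
        server.server_close()
    return captured.get("ua")  # None if no browser connected in time


def load_or_capture_user_agent(cache_path, capture=capture_user_agent):
    """Return the stored UA, capturing and storing it on the first run only."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        return json.loads(cache_path.read_text())["user_agent"]
    user_agent = capture()
    if user_agent:
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        cache_path.write_text(json.dumps({"user_agent": user_agent}))
    return user_agent
```

A call such as `load_or_capture_user_agent(Path.home() / ".rebound" / "ua.json")` (path chosen only as an example) would open the tab once and read from the file afterwards. On a machine with no usable browser, `capture_user_agent` returns `None` after the timeout, so a fallback UA would still be needed.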

@surajawal7

Hi @shobrook, I managed to get past the captcha, but my approach has some dependencies.

Workflow:

  • Run the current script
  • If the captcha page comes up, try to solve the captcha
  • If the captcha only requires ticking a checkbox, it will pass. If an advanced captcha shows up, the user is redirected to manual verification in Chrome

I have to start Google Chrome in debugging mode first and use Selenium to interact with the captcha. The approach therefore depends on opening Google Chrome in debugging mode and on the Selenium web driver, which may cause issues depending on the device and platform. But with this method, I find that once the captcha has been solved, it does not occur again until Chrome in debugging mode is restarted; in the best case, the captcha does not show up even after Chrome is restarted.
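A rough sketch of the "Chrome in debugging mode plus Selenium" setup described above, not the actual code used here. It assumes Chrome's `--remote-debugging-port` flag and Selenium's `debuggerAddress` option; the port, the profile directory, the `google-chrome` binary name, and all function names are assumptions chosen for illustration.

```python
def chrome_debug_command(port=9222, profile_dir="/tmp/rebound-chrome",
                         binary="google-chrome"):
    """Command line that starts Chrome with remote debugging enabled."""
    return [
        binary,
        f"--remote-debugging-port={port}",
        f"--user-data-dir={profile_dir}",  # separate profile for the debug session
    ]


def attach_to_chrome(port=9222):
    """Attach Selenium to the already running Chrome instead of launching a new one."""
    from selenium import webdriver  # third-party dependency: pip install selenium

    options = webdriver.ChromeOptions()
    options.add_experimental_option("debuggerAddress", f"127.0.0.1:{port}")
    return webdriver.Chrome(options=options)


def page_source_via_chrome(url, port=9222):
    """Load `url` in the attached browser and return the rendered HTML."""
    driver = attach_to_chrome(port)
    driver.get(url)
    return driver.page_source
```

Chrome would be started once with something like `subprocess.Popen(chrome_debug_command())`, after which `page_source_via_chrome(...)` reuses that browser and its cookies. That reuse is presumably why a captcha solved once stays solved for the session, and the binary name and profile path are exactly the parts most likely to differ between platforms.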

@tiagoarodrigues55

Hi guys, did you come to any conclusions? Do you need help fixing the issue? I'm having the same problem, and I thought the idea of the project was very popular; I wanted to see it work...

7 participants