cfscrape on colab #439

Open
Medivenn opened this issue Mar 7, 2022 · 0 comments

Medivenn commented Mar 7, 2022

Before creating an issue, first upgrade cfscrape with pip install -U cfscrape and see if you're still experiencing the problem. Please also confirm your Node version (node --version or nodejs --version) is version 10 or higher.

Make sure the website you're having issues with is actually using anti-bot protection by Cloudflare and not a competitor like Imperva Incapsula or Sucuri. And if you're using an anonymizing proxy, a VPN, or Tor, Cloudflare often flags those IPs and may block you or present you with a captcha as a result.

Please confirm the following statements and check the boxes before creating an issue:

  • [x] I've upgraded cfscrape with pip install -U cfscrape
  • [x] I'm using Node version 10 or higher
  • [x] The site protection I'm having issues with is from Cloudflare
  • [x] I'm not using Tor, a VPN, or an anonymizing proxy
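For anyone confirming the Node requirement from inside a notebook, here is a minimal sketch of checking the major version programmatically. The helper names `parse_node_major` and `installed_node_major` are hypothetical and not part of cfscrape. The sketch assumes the binary prints a version string of the form `v12.22.9`.

```python
import re
import shutil
import subprocess


def parse_node_major(version_text):
    """Extract the major version from output such as 'v12.22.9'."""
    match = re.match(r"v?(\d+)\.", version_text.strip())
    if match is None:
        raise ValueError("unrecognized version string: %r" % version_text)
    return int(match.group(1))


def installed_node_major():
    """Return the major version of node/nodejs found on PATH, or None if absent."""
    for name in ("node", "nodejs"):
        path = shutil.which(name)
        if path:
            result = subprocess.run(
                [path, "--version"], capture_output=True, text=True, check=True
            )
            return parse_node_major(result.stdout)
    return None
```

Calling `installed_node_major()` and comparing the result against 10 would cover the second checkbox. A return value of `None` would mean that neither binary is on the PATH.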

Python version number

Run python --version and paste the output below:

cfscrape version number

Run pip show cfscrape and paste the output below:

Version: 2.1.1
Summary: A simple Python module to bypass Cloudflare's anti-bot page. See https://github.com/Anorov/cloudflare-scrape for more information.
Home-page: https://github.com/Anorov/cloudflare-scrape
Author: Anorov
Author-email: anorov.vorona@gmail.com
License: UNKNOWN
Location: /usr/local/lib/python3.7/dist-packages
Requires: requests
Required-by:

Code snippet involved with the issue

import cfscrape
scraper = cfscrape.create_scraper()  
scraper.get("https://www.mytek.tn/informatique.html") 

Complete exception and traceback

(If the problem doesn't involve an exception being raised, leave this blank)

AttributeError                            Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/cfscrape/__init__.py in solve_challenge(self, body, domain)
    254                 r"(?:[^{<>]*},\s*(\d{4,}))?",
--> 255                 javascript, flags=re.S
    256             ).groups()

AttributeError: 'NoneType' object has no attribute 'groups'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
4 frames
/usr/local/lib/python3.7/dist-packages/cfscrape/__init__.py in solve_challenge(self, body, domain)
    290             raise ValueError(
    291                 "Unable to identify Cloudflare IUAM Javascript on website. %s"
--> 292                 % BUG_REPORT
    293             )
    294 

ValueError: Unable to identify Cloudflare IUAM Javascript on website. Cloudflare may have changed their technique, or there may be a bug in the script.

Please read https://github.com/Anorov/cloudflare-scrape#updates, then file a bug report at https://github.com/Anorov/cloudflare-scrape/issues."
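The first traceback shows `.groups()` being called on a regex search result that was `None`, meaning the expected challenge script was not found in the response body. The sketch below reproduces that failure mode in isolation. The pattern `CHALLENGE_RE`, the helper `extract_challenge`, and the sample input are made up for illustration and are not the ones cfscrape uses.

```python
import re

# Hypothetical pattern for illustration only; cfscrape's real pattern differs.
CHALLENGE_RE = re.compile(
    r"setTimeout\(function\(\)\{(.*?)\}, (\d{4,})\)", flags=re.S
)


def extract_challenge(body):
    """Return (script, delay) from the body, or raise ValueError if absent."""
    match = CHALLENGE_RE.search(body)
    # re.search returns None when nothing matches; calling .groups() on it
    # directly raises AttributeError, as in the traceback above.
    if match is None:
        raise ValueError("challenge script not found in response body")
    return match.groups()
```

Under these assumptions, a body without the expected script raises a `ValueError` right away, rather than an `AttributeError` that is then converted into one.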

URL of the Cloudflare-protected page

https://www.mytek.tn/informatique.html

URL of Pastebin/Gist with HTML source of protected page

[LINK GOES HERE]

Medivenn added the bug label Mar 7, 2022