Windows PDFBox.PDFBox() fails at Urllib error #24

Open
Rammurthy5 opened this issue Aug 6, 2020 · 3 comments


@Rammurthy5

When I import pdfbox and instantiate PDFBox(),
it immediately throws the following error. Please help.

urllib.error.URLError: <urlopen error [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connection host has failed to respond>

@lebedov
Owner

lebedov commented Aug 6, 2020

Looks like a network connectivity issue. Is your computer connected to the Internet through a corporate firewall? If you start a python3 session on your Windows box, does the following code run without raising an exception?

# urllib.request must be imported explicitly; a bare "import urllib" does not load it
import urllib.request
r = urllib.request.urlopen('https://archive.apache.org/dist/pdfbox/')
data = r.read()

@Rammurthy5
Author

Rammurthy5 commented Aug 25, 2020

No, I couldn't. It throws the same error. How do I fix it?
Can I add a proxy address to it?

@lebedov
Owner

lebedov commented Aug 25, 2020

You can try setting the environment variable http_proxy or https_proxy (depending on the protocol) to the URI of your proxy before importing pdfbox.
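
As a rough sketch of the environment-variable approach, the variables can also be set from within Python itself, as long as that happens before the first download is attempted. The proxy address below (proxy.example.com:8080) is a made-up placeholder, not something from this thread; substitute whatever your network administrator gives you. The call to urllib.request.getproxies() is only there to confirm that urllib picks the setting up.

```python
import os
import urllib.request

# Hypothetical proxy URI (an assumption for illustration); replace with your own.
PROXY = "http://proxy.example.com:8080"

# Set both variables so the proxy is used regardless of the URL scheme.
os.environ["http_proxy"] = PROXY
os.environ["https_proxy"] = PROXY

# urllib builds its proxy map from the environment; this shows what it sees.
proxies = urllib.request.getproxies()
print(proxies.get("https"))

# import pdfbox  # import only after the variables above have been set
```

If the printed value matches your proxy URI, requests made through urllib.request.urlopen() in the same process should be routed through it.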

Another possibility is to set the user agent to that of a common web browser, as some firewalls block HTTP requests that do not appear to come from one. Try the following code and see whether it throws an error:

# urllib.request must be imported explicitly; a bare "import urllib" does not load it
import urllib.request
req = urllib.request.Request(
    url='https://archive.apache.org/dist/pdfbox/',
    data=None,
    headers={
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:79.0) Gecko/20100101 Firefox/79.0'
    }
)
r = urllib.request.urlopen(req)
data = r.read()
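
If it helps with reporting back, the same check could be wrapped so that it returns a short status string instead of a traceback. This is only a sketch: build_request and check_download are hypothetical helper names, not part of pdfbox, and the 15-second timeout is an arbitrary choice.

```python
import urllib.error
import urllib.request

PDFBOX_URL = "https://archive.apache.org/dist/pdfbox/"
BROWSER_UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:79.0) "
              "Gecko/20100101 Firefox/79.0")


def build_request(url=PDFBOX_URL, user_agent=BROWSER_UA):
    """Build a GET request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, data=None,
                                  headers={"User-Agent": user_agent})


def check_download(url=PDFBOX_URL, timeout=15):
    """Try to fetch url and return a one-line description of the outcome."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as r:
            return f"OK: HTTP {r.status}, {len(r.read())} bytes"
    except OSError as exc:
        # URLError (including WinError 10060) and socket timeouts are OSErrors.
        return f"FAILED: {getattr(exc, 'reason', exc)}"
```

Calling print(check_download()) once with the proxy variables set and once without should show which of the two suggestions, if either, gets the request through.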
