Option to be able to use datasets behind a proxy #275

Description

@tarrade

Is your feature request related to a problem? Please describe.
Right now, behind a proxy, it does not work:


ds_train = tfds.load(name="cats_vs_dogs", split=tfds.Split.TRAIN)

C:\Program Files\Anaconda3\envs\env_gcp_dl_2_0_ds\lib\site-packages\requests\adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
    514                 raise SSLError(e, request=request)
    515
--> 516             raise ConnectionError(e, request=request)
    517
    518         except ClosedPoolError as e:
ConnectionError: HTTPConnectionPool(host='storage.googleapis.com', port=80): Max retries exceeded with url: /tfds-data/ (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x00000192F3D06668>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond',))

I don't think this is supported for now (I didn't see it in the documentation):
https://www.tensorflow.org/datasets/api_docs/python/tfds/load

This will affect quite a lot of people working in companies and universities.

Describe the solution you'd like
I am not an expert, but using requests seems to be the standard way. Below is an example from a Google GCP tool:

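import os

from google.cloud import storage

# The 'xxx' values are placeholders; replace them with real paths and URLs.
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'xxx'  # GCP-specific credentials file
os.environ['HTTPS_PROXY'] = 'xxx'  # proxy URL
os.environ['REQUESTS_CA_BUNDLE'] = '/xxx/xxx'  # CA bundle for SSL verification
client = storage.Client()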

Ignore GOOGLE_APPLICATION_CREDENTIALS, which is specific to GCP. The user needs to set up one or two environment variables and everything is done in the background (I guess this is using requests).

http://docs.python-requests.org/en/master/user/advanced/#ssl-cert-verification
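To illustrate the environment-variable approach described above, here is a minimal sketch using only the standard library. It sets the proxy variables and then checks what Python's own proxy discovery reports, which is a quick way to confirm the variables are visible to the process before calling tfds.load. The proxy address and the helper name set_proxy_env are hypothetical. Note that the traceback above shows a plain HTTP connection (port=80), so HTTP_PROXY is probably needed in addition to HTTPS_PROXY. Whether tfds actually honours these variables depends on its download backend and version, which is exactly what this request is about.

```python
import os
import urllib.request

# Hypothetical proxy address (an assumption); replace with your organisation's proxy.
PROXY_URL = "http://proxy.example.com:8080"


def set_proxy_env(proxy_url):
    """Export the proxy for HTTP and HTTPS and return what urllib detects.

    Both upper- and lower-case names are set because different tools
    read different spellings.
    """
    for name in ("HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy"):
        os.environ[name] = proxy_url
    # getproxies() reads the *_proxy environment variables.
    return urllib.request.getproxies()


detected = set_proxy_env(PROXY_URL)
print("http proxy: ", detected.get("http"))
print("https proxy:", detected.get("https"))
```

If both entries show the expected proxy, any library that reads proxy settings from the environment should route its downloads through it. A corporate proxy that re-signs TLS traffic may additionally need REQUESTS_CA_BUNDLE, as in the GCP example.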

Labels: enhancement (New feature or request)
