Common Voice download fails with a 403 error #1324

Open
daniel-dona opened this issue Apr 17, 2024 · 2 comments

@daniel-dona
Contributor

Found while testing the icefall egs for commonvoice/ASR with ./prepare.sh.

Running lhotse download commonvoice [...] results in an HTTP error:

2024-04-17 21:54:04,887 INFO [commonvoice.py:84] Language: fr
Downloading CommonVoice languages:   0%|          | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/dani/.local/bin/lhotse", line 8, in <module>
    sys.exit(cli())
  File "/home/dani/.local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/dani/.local/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/dani/.local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/dani/.local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/dani/.local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/dani/.local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/dani/.local/lib/python3.10/site-packages/lhotse/bin/modes/recipes/commonvoice.py", line 76, in commonvoice
    download_commonvoice(
  File "/home/dani/.local/lib/python3.10/site-packages/lhotse/recipes/commonvoice.py", line 105, in download_commonvoice
    resumable_download(
  File "/home/dani/.local/lib/python3.10/site-packages/lhotse/utils.py", line 543, in resumable_download
    raise e
  File "/home/dani/.local/lib/python3.10/site-packages/lhotse/utils.py", line 517, in resumable_download
    _download(req, file_size)
  File "/home/dani/.local/lib/python3.10/site-packages/lhotse/utils.py", line 499, in _download
    with urllib.request.urlopen(rq) as response:
  File "/usr/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.10/urllib/request.py", line 525, in open
    response = meth(req, response)
  File "/usr/lib/python3.10/urllib/request.py", line 634, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.10/urllib/request.py", line 563, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.10/urllib/request.py", line 643, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

Looks like S3 downloads need to be signed...
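
To check that guess, it helps to look at what the server actually sends back instead of only the last line of the traceback. Below is a minimal sketch, using only the standard library calls that already appear in the traceback (`urllib.request.urlopen`, `urllib.error.HTTPError`). The helper names `describe_http_error` and `try_open` are hypothetical, and the `Server: AmazonS3` header in the demo is a synthetic value used for illustration, not something captured from the real request.

```python
import urllib.error
import urllib.request
from email.message import Message


def describe_http_error(err: urllib.error.HTTPError) -> str:
    """Summarise an HTTPError so a 403 from the storage backend is easy to spot."""
    headers = err.headers
    server = headers.get("Server", "unknown") if headers is not None else "unknown"
    hint = ""
    if err.code == 403:
        # Assumption: a 403 on a storage URL often means a missing or expired signature.
        hint = " (access denied; the URL may need a signature or may have expired)"
    return f"HTTP {err.code} {err.reason} from server '{server}'{hint}"


def try_open(url: str) -> str:
    """Attempt a plain, unsigned GET and report what happened."""
    try:
        with urllib.request.urlopen(url) as response:
            return f"HTTP {response.status} OK"
    except urllib.error.HTTPError as err:
        return describe_http_error(err)


if __name__ == "__main__":
    # Synthetic error object for illustration only; no network access is needed.
    fake_headers = Message()
    fake_headers["Server"] = "AmazonS3"
    fake = urllib.error.HTTPError(
        "https://example.invalid/archive.tar.gz", 403, "Forbidden", fake_headers, None
    )
    print(describe_http_error(fake))
```

Calling `try_open` with the URL that lhotse tries to fetch would show whether the rejection comes from the storage backend itself; the URL is not reproduced here because the log above does not print it.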

@pzelasko
Collaborator

Something might have changed on the CommonVoice side. If the issue persists, it may be best to download directly from their site.

@daniel-dona
Contributor Author

> Something might have changed on the CommonVoice side. If the issue persists, it may be best to download directly from their site.

We can use the same method the original download page uses; they have an (undocumented) API:

https://gist.github.com/daniel-dona/e1bce1d8ab01284d019d087664127cba
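
Until something like that lands, a stopgap could be to copy the download link from the CommonVoice site by hand and fetch the archive with a few lines of Python. This is only a sketch under assumptions: `fetch_to_file` is a hypothetical helper, the URL is whatever link the site hands out, and nothing in this thread confirms where lhotse expects the archive to be placed afterwards.

```python
import shutil
import urllib.request
from pathlib import Path


def fetch_to_file(url: str, destination: Path, chunk_size: int = 1024 * 1024) -> Path:
    """Stream `url` to `destination`, writing to a temporary '.part' file first."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    partial = destination.with_name(destination.name + ".part")
    # Stream in chunks so a multi-gigabyte archive is never held in memory.
    with urllib.request.urlopen(url) as response, open(partial, "wb") as out:
        shutil.copyfileobj(response, out, chunk_size)
    # Rename only after the full body has been written, so an interrupted
    # download never leaves a truncated file under the final name.
    partial.replace(destination)
    return destination
```

Unlike the `resumable_download` call in the traceback, this sketch does not resume interrupted transfers; it just restarts from scratch on the next call.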
