Support simple download URL for archives #1629

Closed
bbarker opened this issue Oct 23, 2018 · 4 comments

Comments

@bbarker

bbarker commented Oct 23, 2018

Currently, download URLs for archives appear to be indirect and do not include the file name extension (e.g. when fetched with wget). Ideally, the filename of the archive should be preserved when downloading with a command-line tool. Is it possible to change this?

@lnielsen
Member

Thanks for reporting. AFAIK this is already supported. Example:

$ wget https://zenodo.org/api/files/4f53dd1f-df5f-4a9c-8b46-6eacfc4b8840/results.zip
--2018-10-24 08:33:10--  https://zenodo.org/api/files/4f53dd1f-df5f-4a9c-8b46-6eacfc4b8840/results.zip
Resolving zenodo.org... 137.138.76.77
Connecting to zenodo.org|137.138.76.77|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 309980977 (296M) [application/octet-stream]
Saving to: 'results.zip'

Do you have an example where it's not the case?

@benjaminhwilliams

Hi @lnielsen,

https://zenodo.org/record/51405
This record, for example, serves its files at URLs of the form
https://zenodo.org/record/<record number>/files/<filename>?download=1,
which is different to the
https://zenodo.org/api/files/<UID>/<filename>
format you give above. With reference to @bbarker's request, using wget on the former isn't possible.

This may be ignorance on my part. Is there a method I'm missing for getting the API-style permalink?

@lnielsen
Member

lnielsen commented Nov 1, 2018

@benjaminhwilliams In both cases the same piece of code serves the file. The only difference is that https://zenodo.org/record/<record number>/files/<filename>?download=1 returns a human-readable error page for errors such as a 404.

As for wget and ?download=1, in my opinion it is wget that is misbehaving. You can, however, simply remove the ?download=1 to satisfy wget.

wget is misbehaving because we are in fact sending the correct filename in the HTTP headers. See below:

$ curl -I "https://zenodo.org/record/51405/files/l-cyst_01.tar.gz?download=1"
HTTP/1.1 200 OK
...
Content-Disposition: attachment; filename=l-cyst_01.tar.gz
...
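
For clients that ignore this header by default, two workarounds are common. The following is a minimal sketch, assuming a POSIX shell with `sed` and `tr` available; the helper names `strip_query` and `filename_from_header` are hypothetical and introduced only for illustration. The first drops the query string so the URL ends in the filename, and the second shows how the name could be read from a `Content-Disposition` line like the one above.

```shell
# Hypothetical helper: drop everything from the first "?" onward,
# so the URL ends in the plain filename.
strip_query() {
  printf '%s\n' "${1%%\?*}"
}

# Hypothetical helper: read the filename from a header line such as
# "Content-Disposition: attachment; filename=l-cyst_01.tar.gz".
# Handles only the simple filename= form (optionally quoted);
# a trailing carriage return from the HTTP response is removed.
filename_from_header() {
  printf '%s\n' "$1" \
    | sed -n 's/.*filename="\{0,1\}\([^";]*\)"\{0,1\}.*/\1/p' \
    | tr -d '\r'
}
```

With these, `wget "$(strip_query "$url")"` should save the file under its plain name. Alternatively, `wget --content-disposition "$url"` or `curl -O -J "$url"` asks the client itself to honor the header.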

That said, if you need automated downloads, it is better to use our REST API, where you get direct file links:

$ curl https://zenodo.org/api/records/51405
{
  ...
  "files": [
    {
      "bucket": "cbc7d513-2359-47fe-a9c6-f826de7776c5",
      "checksum": "md5:780a7b23320307ae8b6cf2d6e99ade1f",
      "key": "l-cyst_fast_04.tar.gz",
      "links": {
        "self": "https://zenodo.org/api/files/cbc7d513-2359-47fe-a9c6-f826de7776c5/l-cyst_fast_04.tar.gz"
      },
      "size": 140654635,
      "type": "gz"
    },
    {
      "bucket": "cbc7d513-2359-47fe-a9c6-f826de7776c5",
      "checksum": "md5:c04800ec8ffaaad867ee54a3a1688ac5",
      "key": "l-cyst_very_fast_01.tar.gz",
      "links": {
        "self": "https://zenodo.org/api/files/cbc7d513-2359-47fe-a9c6-f826de7776c5/l-cyst_very_fast_01.tar.gz"
      },
      "size": 63254814,
      "type": "gz"
    },
  ...
}
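
To turn a response like the one above into a list of wget-able URLs, the `links.self` values can be pulled out of the JSON. The following is a rough sketch that relies only on `grep -o` and `sed`; the function name `self_links` is hypothetical, and it assumes that file links are exactly the `"self"` entries whose URL contains `/api/files/`, as in the excerpt above. A real JSON parser such as `jq` would be more robust.

```shell
# Hypothetical helper: read a record's JSON on stdin and print each
# file's links.self URL on its own line.
# Assumption: file links are the "self" values containing /api/files/,
# as in the response excerpt above; other "self" links are skipped.
self_links() {
  grep -o '"self": *"[^"]*/api/files/[^"]*"' \
    | sed 's/^"self": *"//; s/"$//'
}
```

A possible use would be `curl -s https://zenodo.org/api/records/51405 | self_links | wget -i -`, which feeds the extracted URLs to wget on standard input.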

@ldhulipala

Thanks @lnielsen. Curling api/records/<record id> worked for fetching wget-able URLs. It would be helpful to surface these wget-able URLs directly on the site, if possible.
