
How can I resume downloads? #490

Open
mptnan opened this issue Jun 2, 2022 · 16 comments

Comments

mptnan commented Jun 2, 2022

I'm downloading from a transfer.sh URL with wget -c "https://transfer.sh/..." but partway through an error occurs: "Read error at byte ... (Operation timed out). Retrying.".
Then it restarts the download, but from scratch (the progress bar starts again at 0%). I read #289 but didn't find a solution there.

aspacca (Collaborator) commented Jun 2, 2022

From the wget documentation:

-c
--continue
    Continue getting a partially-downloaded file.  This is useful when you want to finish up a download started by a previous instance of Wget, or by another program.  For instance:

            wget -c ftp://sunsite.doc.ic.ac.uk/ls-lR.Z

    If there is a file named ls-lR.Z in the current directory, Wget will assume that it is the first portion of the remote file, and will ask the server to continue the retrieval from an
    offset equal to the length of the local file.

    Note that you don't need to specify this option if you just want the current invocation of Wget to retry downloading a file should the connection be lost midway through.  This is the
    default behavior.  -c only affects resumption of downloads started prior to this invocation of Wget, and whose local files are still sitting around.

    Without -c, the previous example would just download the remote file to ls-lR.Z.1, leaving the truncated ls-lR.Z file alone.

mptnan (Author) commented Jun 2, 2022

Sorry, but I'm already running "wget -c url", so what's wrong with that?

aspacca (Collaborator) commented Jun 3, 2022

I'm sorry @niss88, you are right.
Range download works only for video and audio files: https://github.com/dutchcoders/transfer.sh/blob/main/server/handlers.go#L1054

I will fix it as soon as I have time.
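
For context, a minimal sketch of the gating being described (hypothetical helper names and signature, not the actual code at handlers.go#L1054):

    package handlers

    import (
        "io"
        "net/http"
        "strings"
        "time"
    )

    // serveFile illustrates the described behavior: range requests are
    // only honored for audio and video content types.
    func serveFile(w http.ResponseWriter, r *http.Request, name, contentType string, modTime time.Time, content io.ReadSeeker) {
        w.Header().Set("Content-Type", contentType)
        if strings.HasPrefix(contentType, "video/") || strings.HasPrefix(contentType, "audio/") {
            // http.ServeContent understands Range headers and answers
            // 206 Partial Content, so wget -c can resume.
            http.ServeContent(w, r, name, modTime, content)
            return
        }
        // Every other type streams the whole body: the Range header is
        // ignored, the response is 200 OK, and a resume restarts at zero.
        io.Copy(w, content)
    }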

mptnan (Author) commented Jun 3, 2022

The file I was downloading was a video... but I'll wait either way.

For testing I uploaded an mp4 file, for example https://transfer.sh/KnDl1c/out3.mp4 (~1 MB), and tested as below; --continue (-c) didn't work for the mp4 file

*****.. [13:56:54] tmp% ls
*****.. [13:57:40] tmp% wget -c https://transfer.sh/KnDl1c/out3.mp4
--2022-06-03 13:57:55--  https://transfer.sh/KnDl1c/out3.mp4
Resolving transfer.sh (transfer.sh)... 144.76.136.153, 2a01:4f8:200:1097::2
Connecting to transfer.sh (transfer.sh)|144.76.136.153|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 942941 (921K) [video/mp4]
Saving to: `out3.mp4'

out3.mp4                                     87%[=============================================================================>            ] 807.41K   225KB/s    eta 1s      ^C
*****.. [13:58:01] tmp% wget -c https://transfer.sh/KnDl1c/out3.mp4
--2022-06-03 13:58:07--  https://transfer.sh/KnDl1c/out3.mp4
Resolving transfer.sh (transfer.sh)... 2a01:4f8:200:1097::2, 144.76.136.153
Connecting to transfer.sh (transfer.sh)|2a01:4f8:200:1097::2|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 942941 (921K) [video/mp4]
Saving to: `out3.mp4'

out3.mp4                                     32%[===========================>                                                              ] 295.37K   177KB/s               ^C
*****.. [13:58:11] tmp%

aspacca (Collaborator) commented Jun 6, 2022

hello @niss88

After some tests I can confirm that the code handles download resume for video and audio files (we could enable it for every file type, but that's not related to your problem).

Failure to resume downloads at https://transfer.sh could be due to a proxy that doesn't forward the Range header, so the server answers 200 OK with the full body instead of 206 Partial Content (it was indeed the same on my private instance). @stefanbenten, can you validate this assumption? Do you have a proxy in front of the service?

In the case of nginx, by the way, I had to add the proxy_pass_request_headers on; directive.
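
For reference, the relevant part of my setup looks roughly like this (the upstream address is an assumption, only the directive matters here):

    location / {
        # assumed transfer.sh upstream
        proxy_pass http://127.0.0.1:8080;
        # make sure client headers, including Range, reach the service
        proxy_pass_request_headers on;
    }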

aspacca (Collaborator) commented Jun 6, 2022

Filtering out the Range header before it reaches the transfer.sh service will also break the seeking feature on the audio/video preview page.

stefanbenten (Collaborator) commented

transfer.sh is currently running with nginx in front and the correct proxy pass directive added. That said, it has a timeout of 1800s per request, because there have been HTTP slow attacks in the past.

aspacca (Collaborator) commented Jun 6, 2022

@stefanbenten
Taking @niss88's upload (https://transfer.sh/KnDl1c/out3.mp4), seeking does not work in the video player; that should be related to the Range header.

stefanbenten (Collaborator) commented

I think I have an idea why it is not working correctly. Let me debug the code (because the proxy does forward the header correctly) and put up a PR.

ErrorNoInternet commented

This is also affecting parallel download tools (tools that connect to the server multiple times and download segments of the file in each thread). Such a tool makes a HEAD request to check whether the server returns the Accept-Ranges: bytes header, and then sends GET requests with a Range: bytes=0-N header to download the first N bytes.
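
A quick way to reproduce that handshake (a hypothetical standalone Go probe, not Paralload itself):

    package main

    import (
        "fmt"
        "net/http"
    )

    func main() {
        url := "https://transfer.sh/KnDl1c/out3.mp4" // the test upload from above

        // 1. HEAD: does the server advertise range support?
        head, err := http.Head(url)
        if err != nil {
            panic(err)
        }
        head.Body.Close()
        fmt.Println("Accept-Ranges:", head.Header.Get("Accept-Ranges")) // want "bytes"

        // 2. Ranged GET: a server that honors it answers 206 Partial
        //    Content; 200 OK means it sent the whole file instead.
        req, err := http.NewRequest(http.MethodGet, url, nil)
        if err != nil {
            panic(err)
        }
        req.Header.Set("Range", "bytes=0-1023") // first 1 KiB only
        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()
        fmt.Println("Status:", resp.Status)
    }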

aspacca (Collaborator) commented Jul 9, 2022

@ErrorNoInternet did you hit this issue with transfer.sh intended as software, or on the https://transfer.sh website?

Can you tell me one of the parallel download tools that are affected?

I will check if everything is fine on the transfer.sh side.

For sure, Range: bytes= requests are currently supported only for video and audio content types. We can widen support to every file type.

ErrorNoInternet commented

@aspacca

> did you hit this issue with transfer.sh, intended as software, or on the https://transfer.sh website?

The https://transfer.sh website

> can you tell me one of the parallel download tools that are affected?

Some examples are Paralload (my own tool), Chrome (it has a multi-threaded download feature), chunked-downloader, and a few more small tools.
My ISP throttles download speeds (sometimes down to 20 KB/s) for foreign sites, so I rely on these tools to download files faster.

aspacca (Collaborator) commented Jul 10, 2022

HEAD requests do not return Accept-Ranges: bytes; I will fix that, as well as allow partial downloads for every file type.

This is related to the support in transfer.sh intended as software.

Hosted instances like https://transfer.sh might or might not support this, depending on the presence of a proxy and its setup.
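
The HEAD side of that fix could look roughly like this (a hedged sketch with assumed size and contentType parameters, not the actual transfer.sh handler):

    package handlers

    import (
        "net/http"
        "strconv"
    )

    // headHandler answers HEAD with headers only; Accept-Ranges: bytes is
    // what tells clients that ranged GETs (and therefore parallel segment
    // downloads) are supported.
    func headHandler(w http.ResponseWriter, r *http.Request, size int64, contentType string) {
        w.Header().Set("Accept-Ranges", "bytes")
        w.Header().Set("Content-Type", contentType)
        w.Header().Set("Content-Length", strconv.FormatInt(size, 10))
        w.WriteHeader(http.StatusOK) // no body for HEAD
    }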

aspacca (Collaborator) commented Jul 10, 2022

Indeed, it's rather tricky: we have a mutex in checkMetadata (

    s.lock(token, filename)
    defer s.unlock(token, filename)

) in order to track MaxDownloads and update the counter.

Parallel range requests will be handled sequentially anyway because of the mutex. I cannot see an easy way to get rid of it: just skipping it for range requests would be prone to abuse (do a HEAD, then issue a single range request covering the whole file minus one byte, which is effectively a full download that bypasses the counter).

Keeping track of all the requested ranges to detect when a full download has happened is not a way to go either, since in many cases range requests are not a parallel download (seeking in multimedia content, for example).

Skipping the lock when the upload has no MaxDownloads is troublesome as well: we only know that after checking the metadata, and by then we should already have acquired the lock. I will see if I can move the lock to just before updating the Downloads counter (we would have to re-check the metadata at that point because of race conditions). I have to find an approach that achieves all the required goals (see the sketch after this list):

  1. block the download as soon as the N+1th request is made, with N equal to MaxDownloads
  2. update Downloads in the metadata without race conditions
  3. allow parallel downloads when no MaxDownloads is set
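
For illustration, a hypothetical sketch of the current pattern (only the s.lock / s.unlock calls come from the snippet above; the types and helpers are invented, not the actual transfer.sh code). It shows why concurrent range requests for the same file end up sequential:

    package handlers

    import (
        "errors"
        "sync"
    )

    var errMaxDownloadsExceeded = errors.New("max downloads exceeded")

    type metadata struct {
        Downloads    int // how many times the file was fetched
        MaxDownloads int // 0 means unlimited
    }

    type server struct {
        mu    sync.Mutex // guards the maps below
        locks map[string]*sync.Mutex
        meta  map[string]metadata
    }

    func (s *server) lock(token, filename string)   { s.fileLock(token + "/" + filename).Lock() }
    func (s *server) unlock(token, filename string) { s.fileLock(token + "/" + filename).Unlock() }

    func (s *server) fileLock(key string) *sync.Mutex {
        s.mu.Lock()
        defer s.mu.Unlock()
        if s.locks[key] == nil {
            s.locks[key] = &sync.Mutex{}
        }
        return s.locks[key]
    }

    // checkMetadata holds the per-file lock for the whole check-and-increment,
    // which is exactly what serializes parallel range requests for one file.
    func (s *server) checkMetadata(token, filename string) (metadata, error) {
        s.lock(token, filename)
        defer s.unlock(token, filename)

        key := token + "/" + filename
        m := s.meta[key]
        if m.MaxDownloads > 0 && m.Downloads >= m.MaxDownloads {
            return m, errMaxDownloadsExceeded // goal 1: block the N+1th request
        }
        m.Downloads++ // goal 2: updated under the lock, no races
        s.meta[key] = m
        return m, nil
    }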

aspacca (Collaborator) commented Jul 10, 2022

@ErrorNoInternet here: #495

aspacca (Collaborator) commented Dec 22, 2022

@niss88 allowing parallel range requests turned out to be more complicated than I initially thought, especially regarding the performance impact.

I will get back to this issue once I have more time for it.
