
Error "Requested data is before the reader's current offset" for Upload(stream) when resumption creates a new upload #580

nh2 opened this issue Apr 14, 2023 · 4 comments
nh2 commented Apr 14, 2023

The error Requested data is before the reader's current offset from

if (start < this._bufferOffset) {
  return Promise.reject(new Error("Requested data is before the reader's current offset"))
}

is triggered when the following conditions are met jointly:

  • new Upload(new ReadableStream(...)) is used
  • chunkSize is set and the upload progressed to the second or higher chunk
  • a transmission error occurs and resumption fails (e.g. because the server temporarily returns HTTP 502 Bad Gateway, or any of these designed error situations occurs)

(These conditions are true in common cases, e.g. when you're behind Cloudflare (requiring chunkSize) and you restart your TUS server software for a split second for an upgrade.)

Then, this code triggers:

tus-js-client/lib/upload.js, lines 672 to 675 at commit 01d3948:

// Try to create a new upload
this.url = null
this._createUpload()
return

which creates a completely new upload with offset 0. But it seems that the ReadableStream is not reset and may already have advanced far beyond that offset.

Hence, Requested data is before the reader's current offset necessarily triggers.

Automatically creating a new upload in the middle of the transfer seems logically impossible when the Upload is sourced by a ReadableStream, as the user is given no opportunity to create a stream on which another .slice(0, ...) can succeed.
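The failure mode above can be illustrated with a minimal, self-contained sketch (all names hypothetical, not tus-js-client's actual StreamSource, which additionally buffers up to chunkSize bytes): once a forward-only source has been consumed past an offset, a restart at offset 0 can only be rejected.

```javascript
// Sketch of a forward-only source, mirroring the offset check quoted
// above. Hypothetical names; buffering of the current chunk is omitted.
class ForwardOnlySource {
  constructor() {
    this._bufferOffset = 0 // bytes before this offset are gone for good
  }

  slice(start, end) {
    if (start < this._bufferOffset) {
      // Same condition as in tus-js-client: the requested bytes were
      // already consumed, and a ReadableStream cannot be rewound.
      return Promise.reject(new Error("Requested data is before the reader's current offset"))
    }
    this._bufferOffset = end // pretend we read and discarded [start, end)
    return Promise.resolve({ start, end })
  }
}

// The first chunk succeeds; a "new upload" restart at offset 0 must fail.
const source = new ForwardOnlySource()
source
  .slice(0, 1024) // chunk 1: OK, offset advances to 1024
  .then(() => source.slice(0, 1024)) // restart at offset 0: rejected
  .catch((err) => console.log(err.message))
```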

@nh2 nh2 added the bug label Apr 14, 2023
nh2 commented Apr 14, 2023

This issue is made more likely to crop up by issue #579, as that increases the chances that "a transmission error occurs and resumption fails".

Acconut commented Apr 14, 2023

Automatically creating a new upload in the middle of the transfer seems logically impossible when the Upload is sourced by a ReadableStream, as the user is given no opportunity to create a stream on which another .slice(0, ...) can succeed.

Yes, that is correct. There is currently no way for an upload to recover from this kind of failure. The question is: how can we improve this situation in tus-js-client?

a) Just raise a different error saying that we cannot resume in such a case
b) Provide some API so that the user can supply a new stream (but I am unsure how often this is actually possible).

nh2 commented Apr 15, 2023

a) Just raise a different error saying that we cannot resume in such a case

Yes, I think the following would make sense:

  • Change the error message to explain that Readers cannot be resumed once they progress past the first chunkSize bytes.
  • Document in api.md:
    • In the Upload() constructor docs:
      • Mention that the // Try to create a new upload logic exists. It is not obvious that tus-js-client may try to create a new upload after streaming has already started, so the user may not even be aware of this situation. (After thoroughly reading the TUS spec it becomes clear that some logic like this probably exists, e.g. to handle upload expiration, but it is best mentioned explicitly.)
      • Based on that, mention that this logic will fail once streaming has progressed past the first chunkSize bytes.
      • Mention that in many cases the user should consider using the fileReader option instead of a Reader. fileReader allows slice()ing at arbitrary offsets (not only forward streaming), so it is fully resumable.
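As a sketch of that recommendation: a custom fileReader over an in-memory buffer can serve slice(start, end) at any offset, so creating a new upload from offset 0 stays possible. The openFile/slice/close shape below is an assumption about the FileReader interface, not a verbatim copy of tus-js-client's definition.

```javascript
// Sketch of a fully seekable fileReader over an in-memory Uint8Array.
// The openFile/slice/close shape is assumed for illustration.
class BufferFileReader {
  openFile(input, chunkSize) {
    return Promise.resolve({
      size: input.length,
      // Unlike a ReadableStream, this source can seek backwards to any
      // offset, including 0 after tus-js-client creates a new upload.
      slice(start, end) {
        return Promise.resolve({ value: input.subarray(start, end) })
      },
      close() {},
    })
  }
}

// Hypothetical usage (placeholder endpoint):
// new tus.Upload(myBytes, {
//   endpoint: 'https://tus.example.com/files/',
//   chunkSize: 5 * 1024 * 1024,
//   fileReader: new BufferFileReader(),
// })
```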

b) Provide some API so that the user can supply a new stream (but I am unsure how often this is actually possible).

Isn't this theoretically already possible by providing an arbitrary object as the argument to the Upload() constructor and implementing the fileReader option so that it gets the data from somewhere else (e.g. when slice(start, ...) observes start decrease back to 0, the user's slice() implementation could open a new source from somewhere)? See also #583, which I just filed.
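That idea could look roughly like the following sketch, where openStream is a hypothetical user-supplied callback returning a fresh forward-only reader: when slice() sees start fall below the current position, it discards the exhausted reader and opens a new one.

```javascript
// Sketch: a slice() implementation that reopens its underlying source
// when the requested start jumps back before the current position
// (e.g. because tus-js-client created a new upload at offset 0).
// `openStream` and `ReopenableSource` are hypothetical names.
function* byteStream(data) {
  yield* data
}

class ReopenableSource {
  constructor(openStream) {
    this._openStream = openStream
    this._position = 0
    this._reader = openStream()
  }

  async slice(start, end) {
    if (start < this._position) {
      // Backwards seek requested: restart from a brand-new stream.
      this._reader = this._openStream()
      this._position = 0
    }
    while (this._position < start) { // skip forward to `start`
      this._reader.next()
      this._position++
    }
    const out = []
    while (this._position < end) {
      const { value, done } = this._reader.next()
      if (done) break
      out.push(value)
      this._position++
    }
    return { value: out }
  }
}
```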


As a concrete example of how fileReader can solve more problems than passing a Reader to Upload():

I found a bug in Chrome ("FileReader 5x slower than in Firefox over Windows Share") that limits sequential File read throughput to 20 Mbit/s. A workaround is concurrent, batched reading of multiple File.slice()s. I first implemented it with a Reader but then hit the problem described here. Only then did I find out that I can implement it with the fileReader option to get both custom IO patterns AND full resumability from offset 0.

Acconut commented Apr 16, 2023

  • Change the error message to explain that Readers cannot be resumed once they progress past the first chunkSize bytes.

I am wondering whether we currently have the capability to detect this state inside the Upload class. I do not think so. If you have a file or buffer object, you can seek back to the beginning, so we cannot disallow it in all cases. And the Upload class does not (yet) have a property for detecting streaming resources. So maybe we need to expose this in the FileReader interface.
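One possible shape for exposing that, sketched below with an assumed seekable capability flag (not part of any existing tus-js-client interface): before running the // Try to create a new upload path, Upload could check the flag and raise the clearer error instead of attempting a doomed restart.

```javascript
// Sketch: a capability flag on sources so Upload can decide whether a
// brand-new upload at offset 0 is even possible. `seekable` and both
// classes are assumptions for illustration, not existing API.
class SeekableFileSource {
  constructor() { this.seekable = true } // files/buffers can rewind
}

class StreamingSource {
  constructor() { this.seekable = false } // a ReadableStream cannot
}

function canRetryWithNewUpload(source, bytesAlreadySent) {
  // A new upload restarts at offset 0; a non-seekable source that has
  // already advanced past 0 can never satisfy that request.
  return source.seekable || bytesAlreadySent === 0
}
```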

Isn't this theoretically already possible, by providing an arbitrary object as the argument to the Upload() constructor, and then implementing the fileReader option so that it gets the data from somewhere else (e.g. when slice(start, ...) observes start decrease back to 0, the user's slice() implementation could open some new source from somewhere)? See also #583 which I just filed.

Good point. That might work, although I have never tried it. Plus, I think this approach is rather unintuitive for most casual tus-js-client users. If we embrace a setup where a new stream can be supplied to tus-js-client, I would prefer it be done through a more explicit API.
