Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(blob): multipart upload #71

Merged
merged 20 commits into from
May 31, 2024
Merged

feat(blob): multipart upload #71

merged 20 commits into from
May 31, 2024

Conversation

Teages
Copy link
Contributor

@Teages Teages commented Apr 12, 2024

Resolves #66

This pull request added multipart upload api and its proxy route.

Also fixed a type error on proxyHubBlob, it breaks the type check

did some check in playground, not check the proxy api yet.

Checked in remote proxy, works well except the contentType not included in the complete response (but it returns in dev server), maybe it takes time for r2 to determine the type

Copy link
Contributor

@farnabaz farnabaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering how we can change API routes to remove query params.

Just an idea:

  • /blob/multipart/[...pathname].post.ts create a multip part upload
  • /blob/multipart/[...pathname]/[uploadId]/[partNumber].put.ts put a part
  • /blob/multipart/[...pathname]/[uploadId]/complete.post.ts complete upload task

src/runtime/server/api/_hub/blob/mpu/index.post.ts Outdated Show resolved Hide resolved
src/runtime/server/api/_hub/blob/mpu/[...pathname].put.ts Outdated Show resolved Hide resolved
@farnabaz
Copy link
Contributor

Although that upload works fine, I have this wired error in terminal

Screenshot 2024-04-12 at 17 05 48

@Teages
Copy link
Contributor Author

Teages commented Apr 12, 2024

Although that upload works fine, I have this wired error in terminal

Screenshot 2024-04-12 at 17 05 48

it is from the wroked, as their member said:

This message is not really an error, at least not one that's relevant to most people. This is another example where our logging is highlighting bugs meant for the workers runtime team to look into. In this case V8's API for mapping JITed memory in order to construct stack traces isn't working the way we expect. This has been firing occasionally for a long time. There's actually no impact for workerd users since this information is only used when generating crash dumps in our internal codebase.

so we can safity ignore this

@Teages
Copy link
Contributor Author

Teages commented Apr 12, 2024

I was wondering how we can change API routes to remove query params.

Just an idea:

  • /blob/multipart/[...pathname].post.ts create a multip part upload
  • /blob/multipart/[...pathname]/[uploadId]/[partNumber].put.ts put a part
  • /blob/multipart/[...pathname]/[uploadId]/complete.post.ts complete upload task

sadily radix3 doesn't support this route

import { createRouter } from "radix3"

const router = createRouter()

router.insert('/api/blob/multipart/**:pathname/', { payload: 'not this' })
router.insert('/api/blob/multipart/**:pathname/:uploadId/:partNumber', { payload: 'found' })

const res = router.lookup('/api/blob/multipart/hello.txt/id/1')

console.log(res)

returns

{
  payload: "not this",
  params: {
    pathname: "hello.txt/id/1",
  },
}

so I just make create / complete in different route

@Atinux
Copy link
Contributor

Atinux commented Apr 17, 2024

Thank you so much @Teages

I think to move forward, I would like to have a multipart: true option for hubBlob().put() as automatic multipart:

await hubBlob().put('movie.mp4', file, {
  multipart: true
})

As well as updating the documentation for it.

As a bonus, it would be nice to find a way to have a Vue composable as well:

useUpload(`/api/blob/multipart/${file.name}`).upload(file, { multipart: true })

@Teages
Copy link
Contributor Author

Teages commented Apr 17, 2024

I think to move forward, I would like to have a multipart: true option for hubBlob().put() as automatic multipart:

await hubBlob().put('movie.mp4', file, {
  multipart: true
})

I think it maybe not not be very useful, hubBlob runs on nitro, which is running in cloudflare worker, it should have the most stable connection to cloudflare r2.

The limitation of direct upload seems to be 300mb, but woker only have 128mb memory. I can't see any need for workers to upload to r2 in multipart.

It should be the same for other platforms.

maybe use multi-threaded uploading can be faster? I don’t know

As a bonus, it would be nice to find a way to have a Vue composable as well:

useUpload(`/api/blob/multipart/${file.name}`).upload(file, { multipart: true })

Yes I think so, but it affects flexibility, I didn't see nuxt hub have provided any upload api except hub proxy.

Or we can provide a composable like this:

const { progress } = useMultipartUpload({
  file,
  partSize: 10 * 1024 * 1024, // 10MB
  throttle: 2, // 2 concurrent uploads
}, ({partNumber, body}) => $fetch())

Copy link
Contributor

Atinux commented Apr 19, 2024

I think it maybe not not be very useful, hubBlob runs on nitro, which is running in cloudflare worker, it should have the most stable connection to cloudflare r2.

Makes sense, let's skip it.

What about:

const { progress, isComplete } = useMultipartUpload(file, {
  partSize: 10 * 1024 * 1024, // 10MB
  throttle: 2, // 2 concurrent uploads
}, ({ partNumber, chunkBody }) => $fetch())

It would be nice to have the second parameter as optional:

const { progress, isComplete } = useMultipartUpload(file, ({ partNumber, chunkBody }) => $fetch())

So it could use defaults we can define, wdyt?

@Teages
Copy link
Contributor Author

Teages commented Apr 21, 2024

well I designed a composable creater:

export const useMultipartUpload = createMultipartUploader({
  create: (file) => $fetch(),
  upload: ({ partNumber, chunkBody }, { pathname, uploadId }) => $fetch(),
  complete: (parts, { pathname, uploadId }) => $fetch(),
  abort: ({ pathname, uploadId }) => $fetch(),
  partSize: 10 * 1024 * 1024, // 10MB
  concurrent: 2, // 2 concurrent uploads
  maxRetry: 3,
})

then upload file in just two lines:

const { completed, progress, cancel } = useMultipartUpload(file)
const result = await completed

@Atinux
Copy link
Contributor

Atinux commented Apr 25, 2024

Thank you so much for your work, we are getting close!

I've been thinking, what about using the action as first part?

/api/blob/multipart/[action]/[...pathname].ts

action can be: create, upload, complete and abort

We could have an abstraction similar to this:

// server/api/blob/multipart/[action]/[...pathname].ts
export default eventHandler(async (event) => {
  await hubBlob().handleMultipartUpload(event, {
    // options...
    addRandomSuffix: true
  })
})

This will avoid creating 4 files to handle multipart upload, and it should also simplify the createMultipartUploader function, I would rename it to useMultipartUpload instead:

const upload = useMulipartUpload('/api/blob/multipart', {
  partSize: 10 * 1024 * 1024, // 10MB
  concurrent: 2, // 2 concurrent uploads
  maxRetry: 3,
})
const { progress, completed, abort } = upload(file)

Also, would you mind updating the progress to be between 0 and 100 similar to https://nuxt.com/docs/api/composables/use-loading-indicator#progress ?

@Teages
Copy link
Contributor Author

Teages commented Apr 25, 2024

That would be great, what options do we need for handleMultipartUpload? I would like to provide some hooks like beforeCreate, afterCompleted and etc, but we can let developers do themself like

// server/api/blob/multipart/[action]/[...pathname].ts
export default eventHandler(async (event) => {
  const actions = getRouterParam(event, 'action')
  // do something before handle
  return await hubBlob().handleMultipartUpload(event)
})

and most upload need authentication, I want to allow overwrite ofetch options of useMulipartUpload,

interface UseMulipartUploadOptions {
  //... some else
  fetchOptions: Omit<FetchOptions, 'body' | 'parseResponse' | 'responseType'>
}

Copy link
Contributor

Atinux commented Apr 25, 2024

I would like to provide some hooks like beforeCreate, afterCompleted and etc,

The goal of exposing a function is that they can do whatever they want before and after the function instead of exposing an API route :)

I like the fetchOptions property for useMultipartUpload 🚀

@Atinux
Copy link
Contributor

Atinux commented Apr 26, 2024

You are very fast @Teages 🚀

Happy to resolve the conflicts? 🙏

@Teages
Copy link
Contributor Author

Teages commented Apr 26, 2024

oops seems I messed up, I need some time to solve

@Teages
Copy link
Contributor Author

Teages commented Apr 26, 2024

ok it looks good now

@Teages
Copy link
Contributor Author

Teages commented May 17, 2024

I noticed #99, everything looks fine except the way to resolve returning is different.
In this pr I use respondWith to force the response to be the handled response, but in #99 it just return the response and lets the user return on their own.

Do I need to change to the same approach?

Copy link
Contributor

Atinux commented May 24, 2024

cc @farnabaz

@Atinux Atinux merged commit eedc944 into nuxt-hub:main May 31, 2024
4 checks passed
@Teages Teages deleted the feat/mpu branch May 31, 2024 16:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Multipart API support
3 participants