[Feature] support loading layer data from S3 #2914

Open
james-willis opened this issue Mar 11, 2024 · 5 comments

Comments

james-willis commented Mar 11, 2024

Loading data from S3 is a common use case today. With loaders.gl this is simple enough for public data, since it is easy to provide an HTTPS URL to an S3 object. For single-object datasets, a signed URL can be generated and passed to the loader.

However, for tile datasets where a URL template is provided, the signed URL will vary with each tile. I'd like to request a feature in loaders.gl that adds broad support for S3 URLs, covering both public and private objects.
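
To illustrate why a single signed URL is not enough for tiled data, here is a minimal sketch. The `expandTileTemplate` helper and the bucket URL are hypothetical and not part of loaders.gl. Each tile resolves to a different object key, and a signature computed for one object's URL does not cover another, so every expanded URL would need its own signing step.

```typescript
// Hypothetical helper (not a loaders.gl API): fill a {z}/{x}/{y} template for one tile.
type TileIndex = {x: number; y: number; z: number};

function expandTileTemplate(template: string, tile: TileIndex): string {
  return template
    .replace(/\{x\}/g, String(tile.x))
    .replace(/\{y\}/g, String(tile.y))
    .replace(/\{z\}/g, String(tile.z));
}

// Placeholder bucket and path, for illustration only.
const template = 'https://example-bucket.s3.amazonaws.com/tiles/{z}/{x}/{y}.mvt';

const tiles: TileIndex[] = [
  {x: 0, y: 0, z: 1},
  {x: 1, y: 0, z: 1}
];

// Two tiles, two distinct object URLs; each would have to be signed separately.
const urls = tiles.map((tile) => expandTileTemplate(template, tile));
```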

Users would pass the S3 URL and, for private objects, potentially an S3ClientConfig.
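
For the public-object case, accepting an S3 URL could be as small as rewriting it to HTTPS. The sketch below assumes an `s3://bucket/key` input and S3's virtual-hosted-style addressing; the `s3ToHttps` name and the explicit `region` parameter are assumptions for illustration, not an existing loaders.gl interface. Private objects would still need a signing step on top of this.

```typescript
// Hypothetical helper: convert s3://bucket/key into a virtual-hosted-style HTTPS URL.
function s3ToHttps(s3Url: string, region: string): string {
  const prefix = 's3://';
  if (!s3Url.startsWith(prefix)) {
    throw new Error(`Not an s3:// URL: ${s3Url}`);
  }
  const rest = s3Url.slice(prefix.length);
  const slash = rest.indexOf('/');
  // Require a non-empty bucket and a non-empty key.
  if (slash <= 0 || slash === rest.length - 1) {
    throw new Error(`Expected s3://bucket/key, got: ${s3Url}`);
  }
  const bucket = rest.slice(0, slash);
  const key = rest.slice(slash + 1);
  // Encode each path segment but keep the "/" separators.
  const encodedKey = key.split('/').map(encodeURIComponent).join('/');
  return `https://${bucket}.s3.${region}.amazonaws.com/${encodedKey}`;
}
```

Note that this encodes braces, so a `{z}/{x}/{y}` tile template would need to be expanded before conversion.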

I have mocked up a potential implementation approach, but I'm unsure how passing credentials should work and whether interface changes should be made.

Related request in deck.gl: visgl/deck.gl#8590

Collaborator

ibgreen commented Mar 12, 2024

I am assuming you want to use the loaders in the browser against a private S3 bucket? Normally signing of URLs happens on the backend. I.e. the browser must request signed URLs from the backend via an API. So we need an async callback where you can do that lookup.
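
A minimal sketch of what such an async callback could look like, under stated assumptions: `UrlResolver`, `FetchLike`, and `createResolvingFetch` are hypothetical names and do not describe an existing loaders.gl interface. The wrapper awaits the callback for every URL before fetching, so the callback is free to ask a backend for a signed URL.

```typescript
// Hypothetical types: none of these names come from loaders.gl.
type UrlResolver = (url: string) => Promise<string>;
type FetchLike<T> = (url: string) => Promise<T>;

// Wrap a fetch-like function so each URL is resolved (e.g. signed by a backend) first.
function createResolvingFetch<T>(
  resolveUrl: UrlResolver,
  baseFetch: FetchLike<T>
): FetchLike<T> {
  return async (url: string): Promise<T> => {
    const resolved = await resolveUrl(url);
    return baseFetch(resolved);
  };
}

// Stand-in resolver for illustration only; a real one would call the
// application's own signing endpoint and return the URL it hands back.
const fakeResolver: UrlResolver = async (url) => `${url}?signature=placeholder`;
```

In a browser, `baseFetch` would typically be the global `fetch`, and the resolver would be supplied by the application.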

Author

james-willis commented Mar 12, 2024

> I am assuming you want to use the loaders in the browser against a private S3 bucket?

Yes. I've edited my original post to call them signed URLs rather than presigned URLs for clarity.

Signing URLs is just a mechanism through which this feature could be implemented.

Signed URLs can be generated anywhere that AWS credentials with read access are available. I don't believe there is any need to call back to do a lookup; signed URL calculation is a pure function of the URL and the credentials.

Collaborator

ibgreen commented Mar 12, 2024

> signed url calculation is a pure function of the url and the credentials

Yes, but normally you don't want to send credentials to the front-end. If you do send the credentials to the front-end, someone can intercept them and start signing URLs to anything in your bucket (perhaps another customer's data, if you store data from multiple customers in the same bucket).

My question is basically: are we designing for the harder case, where the app developer is not willing to send credentials to the client and is standing up a custom URL signing endpoint on their backend? Or is this for the smaller subset of apps that are willing to send signing credentials to their front-end?

Author

james-willis commented Mar 13, 2024

My perspective is that of someone who primarily uses loaders.gl to back pydeck or jupyter-keplergl. In those cases the front end usually already has access to the credentials that are available on the backend.

For these use cases I would prefer that those credentials be leveraged. I think calling the backend for each request adds unneeded complexity and latency here.

Generally, I am imagining cases where the end user has some credentials to provide to the application in order to access the private data.

Collaborator

ibgreen commented Mar 13, 2024

> My perspective comes from someone who primarily is using loaders.gl to back pydeck or jupyter-keplergl

That is helpful. If you are in Python, it is harder to override JS callbacks. Your design seems to indicate that you have the option to install extra JS packages.

My biggest objection to the current proposal is that it builds "knowledge" about S3 into loaders.gl/core (i.e. there is a function named `fetchS3`). It needs to be done through a more abstract, pluggable model...
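
One possible shape for a pluggable model, sketched under stated assumptions: the core keeps only a generic registry keyed by URL scheme, and an optional package registers the `s3` handler. `createFetchRegistry` and `SchemeHandler` are hypothetical names, not an existing loaders.gl API.

```typescript
// Hypothetical sketch: the core dispatches on URL scheme and knows nothing about S3.
type SchemeHandler<T> = (url: string) => Promise<T>;

function createFetchRegistry<T>(fallback: SchemeHandler<T>) {
  const handlers = new Map<string, SchemeHandler<T>>();
  return {
    // Plugins call this to claim a scheme such as "s3".
    register(scheme: string, handler: SchemeHandler<T>): void {
      handlers.set(scheme.toLowerCase(), handler);
    },
    // Route to the registered handler for the URL's scheme, else to the fallback.
    fetch(url: string): Promise<T> {
      const end = url.indexOf('://');
      const scheme = end > 0 ? url.slice(0, end).toLowerCase() : undefined;
      const handler = scheme !== undefined ? handlers.get(scheme) : undefined;
      return (handler ?? fallback)(url);
    }
  };
}
```

With this split, S3-specific code (credentials, signing) would live entirely in whatever package registers the `s3` handler, and URLs with unregistered schemes would go through the fallback unchanged.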
