
Make Storage API consistent and more useful #210

Open
activenode opened this issue Oct 22, 2022 · 10 comments
Labels
enhancement New feature or request


@activenode

activenode commented Oct 22, 2022

Feature request

Problem(s)

  1. Currently, when uploading via supabaseClient.storage.from(bucket).upload(filePath, file), the result is the Key, which seems to be Key = $bucket/$path. Quite honestly, that is extremely unhelpful. Why not return an object like { bucket_id, path, id }? Especially since I couldn't find anything related to this Key; as far as I know, it's not a primary key.

  2. When deleting objects via storage.from...remove, we actually do get a proper result returned (which is odd, because for deletion I would expect only a minimal response, e.g. a deletionCount, just the { id }, or a boolean). Hence: inconsistent compared to my first point, and not easy to follow up on.

  3. The storage functions should be properly documented, not just appear as samples in the Templates (like storage.foldername(objects.name)).

  4. This last one is an architectural question/issue: I feel like the objects table holds too much information, considering that someone who should "simply be able to view an image" shouldn't necessarily be able to retrieve the created_at or metadata columns. That might be none of their business, especially looking at owner and last_accessed_at. Have you thought about making the objects table more minimal (e.g. id, bucket, name) and keeping the other data in a related table, to separate concerns?
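As a stopgap until the response shape changes, the returned Key string can be split client-side into the fields point 1 asks for. This is a hypothetical helper (parseStorageKey is not part of supabase-js), assuming the Key always has the $bucket/$path shape described above:

```typescript
// Hypothetical helper: split an upload response Key ("bucket/path/to/file.png")
// into resource-oriented fields. Assumes the Key format described in this issue.
function parseStorageKey(key: string): { bucket_id: string; path: string } {
  const slash = key.indexOf("/");
  if (slash < 0) {
    throw new Error(`unexpected Key format: ${key}`);
  }
  return {
    bucket_id: key.slice(0, slash), // everything before the first "/"
    path: key.slice(slash + 1),     // the rest, which may itself contain "/"
  };
}
```

For example, `parseStorageKey("avatars/public/user-1.png")` yields `{ bucket_id: "avatars", path: "public/user-1.png" }`.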

@activenode activenode added the enhancement New feature or request label Oct 22, 2022
@burmecia burmecia transferred this issue from supabase/supabase Oct 23, 2022
@fenos
Contributor

fenos commented Nov 4, 2022

@activenode Hello!
Thanks a lot for your feedback!

Here are some of my thoughts regarding your suggestions:

  1. The reason we currently return the Key as the response is that all the other information is already available in your input request, like the bucketId etc. However, I agree that semantically it would be better, and returning more resource-oriented fields would be nicer! We are currently planning to revise the returned fields of our responses, so I think this will be dealt with soon.

  2. This point is also covered above; we'll revise the response types.

  3. The functions are documented here https://supabase.com/docs/guides/storage#storage.filename()

  4. Interesting point of view! The thing is that storage-api is an abstraction layer on top of a file storage system. For this reason, we store object information in a denormalised way, so that you can query this information as quickly as possible.

The way I'd go about solving your concern is to create a view over the objects table that only includes the fields you want to expose to your clients (with appropriate RLS policies).

For example, the view could be called user_objects, as opposed to objects, which is our generic way to describe a file system. What do you think?
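A minimal sketch of that idea (the view name follows the suggestion above; the column names are assumed from the storage.objects schema, so adjust to your project):

```sql
-- Hypothetical minimal view exposing only non-sensitive columns.
-- Note: on PostgreSQL < 15 a plain view runs with its owner's privileges,
-- so access is controlled with GRANT/REVOKE rather than with RLS policies
-- on the view itself.
create view public.user_objects as
select id, bucket_id, name
from storage.objects;

-- Let clients read only the minimal view:
grant select on public.user_objects to authenticated;
```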

@activenode
Author

1 + 2. Thanks, that would be nice
3. Perfect
4. Good approach, but I'm not sure it solves the problem, because in my current understanding one implicitly needs SELECT access to the objects table to be able to access the files. Right or wrong? If right, then even if I created a view (which isn't protectable via RLS, hence I would need the secret key, since I would have to revoke all access to that view to disallow public access), the user would still need access to objects to retrieve storage files. Right or wrong? Aside from that, I agree with your idea.

@fenos
Contributor

fenos commented Nov 4, 2022

1 + 2 + 3. 🎉 Will keep you posted

  4. Yes, you are right: in order for a view to be restricted, the relevant RLS policies must be applied to the table the view is based on.

We are discussing this internally to see if we can find a clean way to blacklist certain columns depending on the user role. Once I hear something back from our Postgres team, I will be able to tell you more.

For now, one solution (not the greatest, in my view) is to create a brand-new table, e.g. user_objects, and set up triggers (insert, update, delete) to keep it in sync.

Then you'd use this user_objects table for SELECT purposes, while the main table is used for all other operations.
I agree that this is not the most elegant solution, but it would achieve what you are looking for.
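The trigger-synced shadow table described above could look roughly like this (a sketch, not an official recipe; table and column names are assumed, and the trigger function needs sufficient privileges on both tables):

```sql
-- Hypothetical shadow table holding only the columns clients may read.
create table public.user_objects (
  id uuid primary key,
  bucket_id text,
  name text
);

alter table public.user_objects enable row level security;

-- Keep the shadow table in sync with storage.objects.
create or replace function public.sync_user_objects() returns trigger as $$
begin
  if tg_op = 'INSERT' then
    insert into public.user_objects (id, bucket_id, name)
    values (new.id, new.bucket_id, new.name);
  elsif tg_op = 'UPDATE' then
    update public.user_objects
    set bucket_id = new.bucket_id, name = new.name
    where id = new.id;
  elsif tg_op = 'DELETE' then
    delete from public.user_objects where id = old.id;
  end if;
  return null; -- AFTER trigger: return value is ignored
end;
$$ language plpgsql security definer;

create trigger user_objects_sync
after insert or update or delete on storage.objects
for each row execute function public.sync_user_objects();
```

Clients would then SELECT from public.user_objects (with your own RLS policies on it), while the Storage API keeps writing to storage.objects as usual.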

However, if we take a step back, we could reconsider whether metadata and created_at are really "sensitive" information for a user who is allowed to select that item.

The metadata field contains information that could be extracted from a response anyway, like fileSize, mime, etc.

We will probably deprecate last_accessed_at while revising the API response, since this column is not as reliable as we'd like: we have a CDN in front of our backend, so most of the time we don't update this column because we hit the cache.

The created_at column also seems OK to return. Maybe you are building a gallery and want to display when an item was uploaded :)

@fenos
Contributor

fenos commented Nov 4, 2022

Just received this link from our Postgres team: custom RLS on views is already available in PostgreSQL 15!

https://www.depesz.com/2022/03/22/waiting-for-postgresql-15-add-support-for-security-invoker-views/

So the idea of custom views with their own RLS policies will be possible as soon as we offer the upgrade to PG 15 at Supabase.

In the meantime, I'm waiting on a workaround (expected next week) to achieve the same with PG 13-14.
Will post it here once I hear back.
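For reference, the PG 15 feature in the linked article is the `security_invoker` view option: the view is evaluated with the caller's permissions, so RLS policies on storage.objects apply through the view. A sketch, assuming the same minimal columns as before:

```sql
-- PostgreSQL 15+: the view runs with the *caller's* privileges, so
-- RLS policies defined on storage.objects are enforced through it.
create view public.user_objects
with (security_invoker = true) as
select id, bucket_id, name
from storage.objects;

grant select on public.user_objects to authenticated;
```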

@activenode
Author

Thanks, valid points. Happy to hear about RLS on views; I'm very excited for that one. I've read that exact same article and am glad to see the upgrade to PG 15. 🙌

@activenode
Author

One additional note: I am still trying to understand the thought process behind the storage architecture. Public storage can be accessed publicly, which is okay for the assets themselves, since image names are most often hashed and put into non-guessable directories (a common CDN pattern).

Hence: In common CDN cases other users cannot "simply guess" files from other users.

However with supabase you don't have to guess because if the storage bucket is public then you can call client.storage.from('bucketname').list() and you have free access to all files ever uploaded.

This part I don't understand. What's the use-case?

Now, my client project has a login. The login MUST use the anon key, obviously. That means users with no access to the system could call .list() on the storage to get access to all "protected" files.

Now you could argue: "Well, then don't make it public." But that is not a good option either, because the only way to deliver files from a private bucket currently is to create a signed URL. I'd have expected getPublicUrl() to work equally with a private bucket, simply putting an auth gate in front of it. But it doesn't, and I need a signed, ever-changing URL, which means massive additional server load and massive additional load for the user (every time the user hits the page, I need to create a new signed URL for every image on that page).

Or in other words:

  • If you make the bucket public, then you don't even have security through obscurity, because a public bucket is essentially an API that lets everyone list all files in that bucket, with no way to constrain that.
  • If you make it private, you generate massive additional load for e.g. loading images from comment sections (random example), because the images can't be cached on the client, as the URL changes on every page load (it isn't static).

@activenode
Author

activenode commented Nov 5, 2022

Take all of this with a grain of salt: I might have something misconfigured, and I'm questioning whether what I said is true. But is there a good description in the docs of what is and isn't accessible with a public bucket?

Though I would still love to see something like getPrivateStaticUrl(), or for getPublicUrl() to also work with private buckets, with an auth wall in front of it.

@activenode
Author

Okay, once more: I now understand that if you do not have SELECT rights, you cannot list the files in the bucket. Conceptually this was hard to grasp, since the policies imply that access to a bucket means access to download/list etc. But apparently you can also grant SELECT rights on public buckets in such a way that everyone can still download, but not everyone can list, correct?

@abhay187

abhay187 commented Dec 28, 2022

Hi @fenos, I stumbled on this issue while looking into why the last_accessed_at column is not updating. From your comment, it looks like it is being deprecated. Can you confirm whether it has been deprecated or not?

Thanks

@bombillazo

bombillazo commented Feb 15, 2024

Any update on the revised response types? I agree: returning only the full path forces us to parse it into its useful parts, since many operations rely on the bucket and "subpath". Also, some operations do not return the ID of the changed object, just a generic "success" message with no context. Returning the ID would be particularly useful, since many Storage API functions work based on path rather than ID, so getting a reference to the modified object's ID is required.
