Display total registry size in GB #334

Open
jbyerline opened this issue Sep 6, 2023 · 3 comments
@jbyerline

Is your feature request related to a problem? Please describe.
It would be nice to know when I am approaching the storage limits of my registry.

Describe the solution you'd like
I would like a simple text or graphic to display how much space all my images are taking up.
It would be cool to see overall registry size as well as size by each image section.

@Joxit
Owner

Joxit commented Sep 10, 2023

Hi, thank you for using my project and submitting ideas.
Unfortunately, a figure for the storage used by your registry would not be accurate, and computing it is too dangerous or will cause perf issues.

  • Why would this be inaccurate?

    • Docker and its registry use layers, which means many images share the same layers. If you have 100 custom images based on the same 100 MB base image, each adding your 1 MB app, the UI may show 100 * 101 MB = 10.1 GB, but the real usage would be 100 MB + 100 * 1 MB = 200 MB.
  • Why is this dangerous, and why would it cause perf issues?

    • To get the size of an image, you need to query the associated tag. So for your full catalog, you must query all the images to get their tags, then query all the tags to get their sizes. This could break your registry or the UI.
  • Do I have alternatives?

    • Yes. The first one: I can add this feature in the taglist view only, but this will not solve the inaccurate size issue. It will also sum only the current page, so if you have too many tags, you will get only part of the size.
    • The second alternative is a side project that takes care of this, such as enhancing the docker registry with a new API. This would be a totally new project and may take some time, without guarantees.

If you're interested in the second alternative, let me know.
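
The arithmetic in the first answer can be reproduced with a minimal sketch. The function names are hypothetical, and the model assumes that every image consists of exactly one shared base layer plus one private app layer, which is a simplification of real images.

```python
def naive_total_mb(image_count, base_mb, app_mb):
    """Sum the full size of every image, counting the shared base once per image."""
    return image_count * (base_mb + app_mb)


def stored_total_mb(image_count, base_mb, app_mb):
    """Storage actually used when the shared base layer is stored only once."""
    return base_mb + image_count * app_mb


naive = naive_total_mb(100, 100, 1)
stored = stored_total_mb(100, 100, 1)
print(f"sum of image sizes: {naive} MB ({naive / 1000:.1f} GB)")
print(f"storage really used: {stored} MB")
```

With the numbers from the comment, the first figure is 10100 MB and the second is 200 MB, so a per-image sum overstates usage by a factor of about 50 in this scenario.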

@julianfoad

I came here looking at Docker Registry UI hoping to find a total size summary. This is the kind of thing I expected to find in a "simplest and most complete UI." I have been running Docker Registry Browser, which doesn't have this feature, and I find that it is not a very useful browser for me at all. I rarely want to browse the images and their details. Pretty much the only things I want to know are a summary of the overall storage, and if that's high then I want to be able to drill down into which images are excessively huge, and which ones have not been recently used so I can choose to delete them.

Hoping to add clarity to the answer above.

  • For the size calculation, AFAIK adding the sizes of the unique image layers (deduplicated by their SHA hashes) gives the correct answer.
  • "Dangerous": I have 19 images currently in my registry, 16 of them have 1 tag ("latest"), 2 have 2 tags, 1 has 4 tags. That's 24 tags in total, so 25 requests to query all the data. That won't break my registry. I'm pretty sure querying all the tags wouldn't break my registry even if I had a million or more.
  • "Perf issues": Querying all the tags in a huge repo would take a long time. I don't know how long one query takes, but as an example if it's a few milliseconds and they aren't parallelised that would limit the usability to around 1000 tags. But surely this project is not designed or intended for a huge repo like docker's or github's, it's intended for simple private scenarios?
  • Querying all the tags in a huge repo, and trying to collate all the response data in memory in the docker-registry-ui server component, could exhaust available memory. For a million tags (say 10,000 images averaging 100 tags each), the storage of (SHA:size) pairs, one per layer, at say 100 bytes each and 10 layers per tag, might take over 1 GB. So it should be limited and able to fail gracefully in case of extreme repo size. Again, surely it's intended for simple private scenarios where this would not be a problem?
  • "Breaking the UI": It doesn't make sense to attempt this querying and collation in the client web app: that would be too slow with client-server network latency over lots of queries. The UI should show a "working..." indication if the server takes a long time to complete its calculation, and be able to show the failure if the repo is so big that the calculation exhausts the memory allocation. It need not break.
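
As a rough illustration of the deduplication idea in the first bullet, here is a minimal sketch. It assumes the manifests have already been fetched and follow the image manifest layout with a `layers` list of `digest`/`size` entries; the function name, the `max_layers` cap, the exception and the sample digests are all hypothetical. Config blobs and multi-architecture manifest lists are ignored.

```python
class TooManyLayers(Exception):
    """Raised when the registry is too large for an in-memory calculation."""


def unique_layer_bytes(manifests, max_layers=1_000_000):
    """Sum layer sizes across manifests, counting each digest only once."""
    size_by_digest = {}
    for manifest in manifests:
        for layer in manifest.get("layers", []):
            size_by_digest[layer["digest"]] = layer["size"]
            if len(size_by_digest) > max_layers:
                # Fail gracefully instead of exhausting memory.
                raise TooManyLayers(f"more than {max_layers} unique layers")
    return sum(size_by_digest.values())


# Two made-up images sharing one base layer (digests are placeholders).
manifests = [
    {"layers": [{"digest": "sha256:base", "size": 100_000_000},
                {"digest": "sha256:app-a", "size": 1_000_000}]},
    {"layers": [{"digest": "sha256:base", "size": 100_000_000},
                {"digest": "sha256:app-b", "size": 1_000_000}]},
]
print(unique_layer_bytes(manifests))
```

The dictionary keyed by digest is what grows with registry size, which is why the sketch caps it rather than assuming it always fits in memory.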

Then there's the second alternative, the option of making the server component fetch the size/usage information through a new API. This could indeed be a new API added to the docker registry API, as suggested. As a simpler alternative, however, it could also be a simple admin-provided API (such as a script that is run) that returns the equivalent of "df" command output (disk free space: total, used, free) on the repository storage device.
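
A "df"-style helper of the kind described above could look like the following sketch. It only applies when the registry uses filesystem storage, and the function name and the path passed to it are assumptions; an admin would point it at the actual storage directory.

```python
import json
import shutil


def storage_report(path):
    """Return total, used and free bytes for the device that holds `path`."""
    usage = shutil.disk_usage(path)
    return {"total": usage.total, "used": usage.used, "free": usage.free}


# "." is a placeholder; replace it with the registry's storage directory.
print(json.dumps(storage_report(".")))
```

Note that this reports usage for the whole storage device, not only for the registry's own files, so it answers "am I running out of space?" rather than "how big are my images?".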

@Joxit
Owner

Joxit commented Oct 23, 2023

Hi, thank you for using my project and your feedback 😄

To be clear, "simplest and most complete UI" means it's a UI-only project using all the APIs of docker registry servers. It's simple because it's only static files served by NGINX: no need for a JVM or Node.js, or to bind the folder of your registry, or anything else. So if the official docker registry does not support something, I cannot bring it to the project. And that is the case for the registry size.

When I'm building open source projects such as this one, I'm always thinking of my users and the setups they could have.

  • For example, in the issue "v2/_category is short. #39", the user is using my project with a repository of 100k+ images.
  • In the issue "support pagination #36", the user is using my project with a repository that has only a few images, but they have thousands of tags.
  • Last but not least, I'm the first user of my project, and I have a registry with 100 images and 700 tags.

So my project must work in all those cases. I don't know whether a repo of 100k+ images, or one with thousands of tags per image, counts as a big repo or not; I will let you choose 😄

Don't get me wrong, this could be a nice addition and I wish I had it, but for now I will not add Θ(m*n) requests, issued each time we go to the home page, to a project that is supposed to be "simple". I may do some tests just in case, but without guarantees.
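
To make the request count concrete, here is a back-of-the-envelope sketch. It assumes one `GET /v2/_catalog` call (no pagination), one `GET /v2/<name>/tags/list` call per image and one manifest request per tag; the function name is hypothetical, and spreading the 700 tags evenly over the 100 images is an assumption made only for the example.

```python
def requests_needed(tags_per_image):
    """Count the registry API calls needed to read every tag's manifest."""
    catalog_calls = 1                    # list all images
    tag_list_calls = len(tags_per_image)  # one tags/list call per image
    manifest_calls = sum(tags_per_image)  # one manifest call per tag
    return catalog_calls + tag_list_calls + manifest_calls


# 100 images with 7 tags each, i.e. 700 tags in total.
print(requests_needed([7] * 100))
```

For that registry the sketch gives 801 requests per visit; with m images of n tags each, the manifest term alone is m*n, which is where the Θ(m*n) comes from.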

Maybe with some sponsorship I could work on a side API project. We need to remember that the registry server supports many storage backends: filesystem, S3, GCS and Azure.
