Garbage Collection provide a way to track progress #13154

Closed
thoro opened this issue Sep 24, 2020 · 10 comments
Labels: area/gc, kind/requirement (New feature or idea on top of harbor)

Comments

@thoro
Contributor

thoro commented Sep 24, 2020

Is your feature request related to a problem? Please describe.
Once a GC job has started, there is no information on how far along it is or how long it will take.

Describe the solution you'd like
Provide a progress field that tells the user how far along a job is; it could be just a field like "50/5000" artifacts. This would help to estimate how long the job will still take and how far it has already progressed. This is especially interesting since GC puts Harbor into read-only mode (still?).

@wy65701436
Contributor

Hi @thoro, in Harbor v2.1 GC became non-blocking, and we provide a dry run that tells you how many blobs will be removed and how much space will be freed up.

@xaleeks, we can consider enhancing the dry run to give more information, such as an estimate of the execution time.

@thoro
Contributor Author

thoro commented Sep 25, 2020

Oh, great, I'm already using 2.1! A current status on the live run would still be interesting.

FYI: My GC took 3 hours and freed up 200 GB

@wy65701436
Contributor

wy65701436 commented Sep 25, 2020

Thanks for the feedback.
Do you mean that it takes about 3 hours on v2.1? What kind of storage are you using?

@thoro
Contributor Author

thoro commented Sep 25, 2020

I have to correct myself: it took 1 hour 20 min to free up 194 GB (total ~460 GB), with 5853 blobs marked for deletion in total. Detecting the blobs to be deleted took around 7 seconds.

The requests to the registry:

2020-09-24T17:50:34Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:259]: delete the manifest with registry v2 API: frontends/cz3, application/vnd.docker.distribution.manifest.v2+json, sha256:cada4f1d6c46c2f5fd64471ecfa34a00d2dec580a09ca02bd78078dcf2003737

Took between 1 and 3 seconds each (per the log timestamps). I see a good opportunity here to parallelize the deletion process: I do have multiple docker-registry instances to load balance across, but all requests are serialized.

I'm using docker registry via S3 on minio (20 node cluster).

@wy65701436
Contributor

wy65701436 commented Sep 25, 2020

Thanks for your data; parallelizing the deletion is a good point. But with non-blocking GC, users can still perform any modification operation (pull/push/delete), so performance may not be as important as it was in read-only mode.

Do you have a rough execution time for GC on versions before v2.1.0?

@thoro
Contributor Author

thoro commented Sep 25, 2020

Yes, since it now allows modification operations, that's already much better. I had just read in the docs that it switches to read-only...

Sadly not. I migrated from Harbor 1.8 and have now used the retention rules to clean up all the garbage that had collected. (So kudos for the retention rules ;) )

@reasonerjt
Contributor

@thoro
I don't think the latest doc still says Harbor will move to read-only during GC?
And in the short term, I suggest you monitor the log of the GC job to track progress.

@thoro
Contributor Author

thoro commented Sep 28, 2020

I probably got redirected to an old version... the search does tend to do that...

For example, if you search for GC, it gives you a 2.0.0 and a 1.10 version... maybe the search should stay on the current version?

@reasonerjt
Contributor

@thoro
Good point, I opened an issue:
goharbor/website#121

steven-zou added the area/gc and kind/requirement labels Oct 12, 2020

@wy65701436
Contributor

Please re-open it if you still need this.
