
[GC performance] The performance of v2 manifest deletion is not good in S3 environment #12948

Open
wy65701436 opened this issue Sep 2, 2020 · 41 comments

Comments

@wy65701436 (Contributor) commented Sep 2, 2020

In an S3 backend environment, we found that it took about 39 seconds to delete a manifest via the v2 API.

[Why we still use the v2 API to handle manifest deletion]

Since Harbor cannot know which tags in the backend storage belong to the manifest, the GC job has to leverage the v2 API to clean them up. However, the v2 API looks up all of the tags and removes them one by one, which can cause a performance issue.
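For context, here is a minimal Go sketch of the kind of v2 manifest-delete call the GC job relies on. The registry address, timeout, and error handling are placeholder assumptions, not Harbor's actual GC code; only the `DELETE /v2/<name>/manifests/<digest>` endpoint itself comes from the distribution API.

```go
// Minimal sketch of the distribution v2 manifest-delete call that the GC job
// relies on. The registry address and timeout are placeholders, not Harbor's
// actual configuration.
package main

import (
	"fmt"
	"log"
	"net/http"
	"time"
)

func deleteManifest(registry, repo, digest string) error {
	url := fmt.Sprintf("%s/v2/%s/manifests/%s", registry, repo, digest)
	req, err := http.NewRequest(http.MethodDelete, url, nil)
	if err != nil {
		return err
	}
	// The registry only answers 202 Accepted after it has walked every tag in
	// the repository to untag the ones pointing at this digest, which is where
	// the 39 seconds (or hours) go.
	client := &http.Client{Timeout: 30 * time.Minute}
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusAccepted {
		return fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return nil
}

func main() {
	err := deleteManifest("http://registry:5000", "library/testingg",
		"sha256:20f39c20df7c5605f77862b711c3d28731e4d569171ec852ce34a06432611faa")
	if err != nil {
		log.Fatal(err)
	}
}
```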

[What we can do next]

1. Investigate how many requests are sent to S3 storage during a v2 manifest deletion.
2. Investigate the possibility of not storing the first tag in the backend, so that the GC job can skip this step.

Log

Sep 1 12:56:35 192.168.144.1 registry[1146]: time="2020-09-01T12:56:35.750530108Z" level=info msg="authorized request" go.version=go1.13.8 http.request.host="registry:5000" http.request.id=c9a4d5ad-4157-4091-a023-93d8e20a5746 http.request.method=DELETE http.request.remoteaddr="192.168.144.9:44072" http.request.uri="/v2/library/testingg/manifests/sha256:20f39c20df7c5605f77862b711c3d28731e4d569171ec852ce34a06432611faa" http.request.useragent=harbor-registry-client vars.name="library/testingg" 
vars.reference="sha256:20f39c20df7c5605f77862b711c3d28731e4d569171ec852ce34a06432611faa" 
Sep 1 12:57:14 192.168.144.1 registry[1146]: time="2020-09-01T12:57:14.340710966Z" level=info msg="response completed" go.version=go1.13.8 http.request.host="registry:5000" http.request.id=c9a4d5ad-4157-4091-a023-93d8e20a5746 http.request.method=DELETE http.request.remoteaddr="192.168.144.9:44072" http.request.uri="/v2/library/testingg/manifests/sha256:20f39c20df7c5605f77862b711c3d28731e4d569171ec852ce34a06432611faa" http.request.useragent=harbor-registry-client http.response.duration=38.598453034s http.response.status=202 http.response.written=0 
@dkulchinsky (Contributor)

Hey @wy65701436, following our chat in Slack, I'd like to share a similar performance issue we're experiencing with a GCS storage backend.

We're running Harbor v2.1.1 and replicated a GCR registry's content to Harbor; however, we forgot to exclude a repo that had, at the time, >60,000 tags.

After replication completed, we deleted the repo in Harbor and ran GC, but the job keeps failing due to a timeout while deleting a manifest:

2020-11-03T16:44:21Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:259]: delete the manifest with registry v2 API: fastly/demo-go-app, application/vnd.docker.distribution.manifest.v2+json, sha256:0f20ddd7f417cae5d685a3ea653617241135644575744db58cf5091dcd6cdf5c

2020-11-03T17:14:21Z [ERROR] [/jobservice/job/impl/gc/garbage_collection.go:262]: failed to delete manifest with v2 API, fastly/demo-go-app, sha256:0f20ddd7f417cae5d685a3ea653617241135644575744db58cf5091dcd6cdf5c, Delete "http://harbor-registry:5000/v2/fastly/demo-go-app/manifests/sha256:0f20ddd7f417cae5d685a3ea653617241135644575744db58cf5091dcd6cdf5c": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

2020-11-03T17:14:21Z [ERROR] [/jobservice/job/impl/gc/garbage_collection.go:165]: failed to execute GC job at sweep phase, error: failed to delete manifest with v2 API: fastly/demo-go-app, sha256:0f20ddd7f417cae5d685a3ea653617241135644575744db58cf5091dcd6cdf5c: Delete "http://harbor-registry:5000/v2/fastly/demo-go-app/manifests/sha256:0f20ddd7f417cae5d685a3ea653617241135644575744db58cf5091dcd6cdf5c": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

Looking at the registry logs, we see that it takes over an hour to delete a manifest:

[harbor-registry-b4fbbb8df-xcgt4 registry] 127.0.0.1 - - [03/Nov/2020:16:44:21 +0000] "DELETE /v2/fastly/demo-go-app/manifests/sha256:0f20ddd7f417cae5d685a3ea653617241135644575744db58cf5091dcd6cdf5c HTTP/1.1" 202 0 "" "harbor-registry-client"

[harbor-registry-b4fbbb8df-xcgt4 registry] time="2020-11-03T17:56:14.798548519Z" level=info msg="response completed" go.version=go1.14.7 http.request.host="harbor-registry:5000" http.request.id=3b01c244-6b14-49ba-bfde-1bfd7934c15e http.request.method=DELETE http.request.remoteaddr="127.0.0.1:59854" http.request.uri="/v2/fastly/demo-go-app/manifests/sha256:0f20ddd7f417cae5d685a3ea653617241135644575744db58cf5091dcd6cdf5c" http.request.useragent=harbor-registry-client http.response.duration=1h11m53.181188629s http.response.status=202 http.response.written=0

We enabled debug logging in the registry and saw that it was spending most of the time iterating through the tags with gcs.GetContent:

❯ grep GetContent harbor-registry-debug-logs.txt|grep demo-go-app|wc -l
   55525

for example:

[harbor-registry-b4fbbb8df-xcgt4 registry] time="2020-11-03T16:44:27.205475289Z" level=debug msg="gcs.GetContent("/docker/registry/v2/repositories/fastly/demo-go-app/_manifests/tags/0023fd10bee5f3cc968e55148169091eb7d1cf795a8780ee7508642ab047042b/current/link")" auth.user.name="harbor_registry_user" go.version=go1.14.7 http.request.host="harbor-registry:5000" http.request.id=3b01c244-6b14-49ba-bfde-1bfd7934c15e http.request.method=DELETE http.request.remoteaddr="127.0.0.1:59854" http.request.uri="/v2/fastly/demo-go-app/manifests/sha256:0f20ddd7f417cae5d685a3ea653617241135644575744db58cf5091dcd6cdf5c" http.request.useragent=harbor-registry-client trace.duration=110.978462ms trace.file="/go/src/github.com/docker/distribution/registry/storage/driver/base/base.go" trace.func="github.com/docker/distribution/registry/storage/driver/base.(*Base).GetContent" trace.id=6374bc55-1500-4fb9-bcf0-f53da9f5fa16 trace.line=95 vars.name="fastly/demo-go-app" vars.reference="sha256:0f20ddd7f417cae5d685a3ea653617241135644575744db58cf5091dcd6cdf5c"

I can share the full GC job and registry debug logs if needed, also happy to provide more information.

@guyguy333

Experiencing exactly the same issue as @dkulchinsky. We're unable to complete a GC; it always ends with a context deadline exceeded. We have more than 1 TB to clean (~130k objects). We can't resume, so we have to start from scratch each time.

@dkulchinsky (Contributor) commented Oct 21, 2021

Hello friends, I'd like to ask that the priority of this issue be raised.

We are running several instances of Harbor (we use a GCS backend, but I think the root cause here is the same) and our usage is growing rapidly.

We are starting to reach capacities that the GC simply cannot handle: in repositories with more than a few thousand tags it takes ~2 minutes to delete a single manifest during GC, GC now takes 10-14 hours, and the problem gets worse every day since we are adding more tags than we are deleting.

On our test/certification Harbor instance we've reached over 20,000 tags on some repositories, and GC just times out on the first manifest since the lookup takes >20 minutes.

We are concerned about rising storage costs, since we can't clean things up, as well as other potential issues that may arise from all these blobs/manifests lingering with no way to properly remove them.

This issue was tagged as a candidate for v2.2.0, and we're already seeing v2.4.0 going out the door.

I'm happy to provide additional context, information, and logs, but I hope we can get some attention on this issue since I think it will impact any user that needs Harbor to work at scale.

/cc @wy65701436 @reasonerjt

stale bot commented Apr 16, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Stale label Apr 16, 2022
@dkulchinsky (Contributor)

I believe this is still an active issue being tracked, so probably shouldn't get closed yet.

@stale stale bot removed the Stale label Apr 17, 2022
github-actions bot commented Jul 7, 2022

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

@github-actions github-actions bot added the Stale label Jul 7, 2022
@sidewinder12s

This is still an issue.

@github-actions github-actions bot removed the Stale label Jul 8, 2022
github-actions bot commented Sep 7, 2022

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

@github-actions github-actions bot added the Stale label Sep 7, 2022
@rcjames commented Sep 7, 2022

This is still an issue.

@dkulchinsky (Contributor)

@wy65701436 any hints on when the team may find some time to look at this? This seems like an issue that desperately needs attention, yet it hasn't seen any traction in over 2 years.

@github-actions github-actions bot removed the Stale label Sep 8, 2022
@twhiteman (Contributor)

Just to further explain the crux of this issue.

Harbor uses docker distribution for its registry component (harbor-registry).

The Harbor GC calls into the docker registry to delete the manifest, which in turn looks up all tags that reference the deleted manifest:
https://github.com/distribution/distribution/blob/78b9c98c5c31c30d74f9acb7d96f98552f2cf78f/registry/handlers/manifests.go#L536

To find the tag references, the docker registry iterates over every tag in the repository and reads its link file to check whether it matches the deleted manifest (i.e., whether it uses the same sha256 digest):
https://github.com/distribution/distribution/blob/78b9c98c5c31c30d74f9acb7d96f98552f2cf78f/registry/storage/tagstore.go#L160

So the more tags you have in your repository, the worse the performance will be (as there will be more S3 API calls for the tag directory lookups and tag file reads).
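To make the cost concrete, here is a simplified Go sketch of that lookup loop. It is illustrative only; the function name and error handling are not the verbatim tagstore.go code, but each `List`/`GetContent` call maps to at least one S3/GCS request.

```go
// Simplified illustration of the tag lookup distribution performs on manifest
// delete; not the verbatim tagstore.go implementation.
package main

import (
	"context"
	"path"
	"strings"

	storagedriver "github.com/docker/distribution/registry/storage/driver"
	"github.com/opencontainers/go-digest"
)

func tagsReferencing(ctx context.Context, driver storagedriver.StorageDriver,
	repo string, target digest.Digest) ([]string, error) {

	base := path.Join("/docker/registry/v2/repositories", repo, "_manifests/tags")
	entries, err := driver.List(ctx, base) // one backend List call per repository
	if err != nil {
		return nil, err
	}

	var tags []string
	for _, entry := range entries {
		// One GetContent call per tag: with 60,000 tags this is 60,000
		// sequential object reads before the manifest delete can complete.
		raw, err := driver.GetContent(ctx, path.Join(entry, "current/link"))
		if err != nil {
			continue // tag without a readable current link
		}
		if digest.Digest(strings.TrimSpace(string(raw))) == target {
			tags = append(tags, path.Base(entry))
		}
	}
	return tags, nil
}
```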

github-actions bot commented Jan 2, 2023

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

@github-actions github-actions bot added the Stale label Jan 2, 2023
@captn3m0 commented Jan 2, 2023

not stale.

@karmicdude commented Mar 15, 2023

Yes, we use the minio-operator, and the MinIO tenant was created for the Harbor registry. The drives are network-attached, with the specifications listed above. You can see that there is a problem: when GC starts, I/O wait is ~450ms, which is a lot.
[screenshot of I/O wait metrics omitted]
Disk reads are also stuck at the upper IOPS limit (320, per the current disks' specifications).

We will test a similar setup on fast disks (SSD or NVMe) with the same data restored from snapshots, to see how the situation changes when GC runs, and I will post when I have some results. But it is unlikely to improve the issue much.
If the situation gets much better, we always have the option to deploy MinIO on a dedicated server with high-performance local disks.

@dkulchinsky (Contributor) commented Mar 21, 2023

> @Antiarchitect: wanted to share some progress

Looks very promising @Antiarchitect, I think in my case the issue is compounded by:

  1. the slow and repeat handling of 404 for manifest delete
  2. GCS library from 2015

I was able to build distribution with your PR distribution/distribution#3702 (update the GCS driver to latest) and the Redis sentinel patch.

I also figured out the issue around the retry loop when a 404 is returned during manifest delete and have a fix for Harbor jobservice here: #18386
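For illustration only, the general shape of that kind of fix might look like the following hedged sketch: treat a 404 on the manifest DELETE as "already gone" instead of retrying until the client times out. This is not the actual code in #18386, just the idea.

```go
// Hedged sketch only; the real fix lives in the Harbor jobservice (#18386) and
// may differ. The idea: a 404 means the manifest is already gone, so don't retry.
package main

import (
	"fmt"
	"net/http"
)

func deleteManifestIdempotent(c *http.Client, url string) error {
	req, err := http.NewRequest(http.MethodDelete, url, nil)
	if err != nil {
		return err
	}
	resp, err := c.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	switch resp.StatusCode {
	case http.StatusAccepted, http.StatusNotFound:
		return nil // 202: deleted now; 404: nothing left to delete
	default:
		return fmt.Errorf("manifest delete failed: %s", resp.Status)
	}
}
```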

I'm running a test now with the above in our sandbox Harbor, and although it's not breaking any speed records, it is looking much better; I will update once I have more concrete numbers.

@hemanth132 commented May 12, 2023

When using S3 for storage, a manifest delete performs an S3 lookup, using list and get calls, to check all the tags referencing this manifest. For a large repository with hundreds of thousands of images, all of these tags are read just to delete a single image.

Since Harbor already maintains the tags referencing a manifest, instead of relying on distribution to find and delete the tag folder, we can introduce a new API in distribution to delete the tag directory altogether when deleting the artifact and skip the tag lookup during the garbage collection step.
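As a rough illustration of that idea (hypothetical: no such endpoint exists in upstream distribution, and the helper name is made up; only the path layout and the storage driver's `Delete` method are standard), removing the whole tags directory through the storage driver would replace the per-tag reads with a single recursive delete:

```go
// Hypothetical helper illustrating the proposed "delete tag directory" API.
// There is no such endpoint in upstream distribution; this only sketches the idea.
package main

import (
	"context"
	"path"

	storagedriver "github.com/docker/distribution/registry/storage/driver"
)

func deleteAllTags(ctx context.Context, driver storagedriver.StorageDriver, repo string) error {
	tagsPath := path.Join("/docker/registry/v2/repositories", repo, "_manifests/tags")
	// One recursive Delete on the tags directory instead of one GetContent per tag.
	return driver.Delete(ctx, tagsPath)
}
```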

I was able to test this out and brought the registry size down from 360TB to less than 30TB by configuring a 30-day retention period. Check out these PRs:
hemanth132#1
hemanth132#2
hemanth132/distribution#2

I have also made a couple of optimizations:

  1. Introduced configurable concurrency in GC.
  2. Added a concurrency limit for retention to avoid DB CPU spikes.
  3. Commented out the deletion of orphan blobs not associated with any artifact in the garbage collection step, since that rarely happens; this can be re-enabled after all the old images are deleted.

I made the changes against the Harbor v2.7.0 code. Now that this is working, I will try to raise a PR in the main Harbor repo in the coming weeks.

Please let me know if this helps you.

@dkulchinsky (Contributor)

Hey @hemanth132, just a quick shout out that this looks very promising and thanks for your effort.

Would love to see these changes make it into Harbor so we can finally run GC 😅

/cc @Vad1mo, this is related to our conversation in Slack earlier.

@Vad1mo added the never-stale (Do not stale) label Jun 13, 2023
@Vad1mo (Member) commented Jun 14, 2023

@wy65701436, take a look at #12948 (comment)

@sebglon commented Nov 14, 2023

Any update?

@jwojnarowicz

Has anyone tested @hemanth132's solution? Are there any updates from the Harbor team regarding the API or GC? @Vad1mo @chlins Bumping because this is still an important issue for every S3-backed deployment.

@karmicdude

Any update? This is still causing huge pain: GC runs slower than data is added, so we constantly have to extend the disks.

microyahoo added a commit to microyahoo/distribution that referenced this issue Apr 18, 2024
Harbor uses distribution for its registry component (harbor-registry).
The Harbor GC calls into the registry to delete the manifest, which in turn
looks up all tags that reference the deleted manifest. To find the tag
references, the registry iterates over every tag in the repository and reads
its link file to check whether it matches the deleted manifest (i.e., whether
it uses the same sha256 digest). So the more tags in the repository, the worse
the performance will be (as there will be more S3 API calls for the tag
directory lookups and tag file reads).

Therefore, we can use concurrent lookup and untag to optimize performance as described in goharbor/harbor#12948.

This optimization was originally contributed by @Antiarchitect; now I would like to take it over.
Thanks for @Antiarchitect's efforts in PR distribution#3890.

Signed-off-by: Liang Zheng <zhengliang0901@gmail.com>
@microyahoo (Contributor)

Hi @karmicdude, I have taken over @Antiarchitect's efforts on concurrent lookup and untag in PR distribution/distribution#4329. You can try it and check whether it brings an improvement, thanks.
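For anyone curious what the optimization amounts to, here is a rough Go sketch (not the actual code in distribution/distribution#4329, which changes the tag store itself) of looking up the referencing tags with a bounded number of concurrent link reads instead of a sequential loop:

```go
// Illustrative only; the real change lives in distribution/distribution#4329.
// This sketch bounds the number of concurrent GetContent calls with an errgroup.
package main

import (
	"context"
	"path"
	"strings"
	"sync"

	storagedriver "github.com/docker/distribution/registry/storage/driver"
	"github.com/opencontainers/go-digest"
	"golang.org/x/sync/errgroup"
)

func tagsReferencingConcurrent(ctx context.Context, driver storagedriver.StorageDriver,
	repo string, target digest.Digest, workers int) ([]string, error) {

	base := path.Join("/docker/registry/v2/repositories", repo, "_manifests/tags")
	entries, err := driver.List(ctx, base)
	if err != nil {
		return nil, err
	}

	var (
		mu   sync.Mutex
		tags []string
	)
	g, ctx := errgroup.WithContext(ctx)
	g.SetLimit(workers) // bound the number of in-flight link reads
	for _, entry := range entries {
		entry := entry
		g.Go(func() error {
			raw, err := driver.GetContent(ctx, path.Join(entry, "current/link"))
			if err != nil {
				return nil // skip unreadable links, as the sequential path does
			}
			if digest.Digest(strings.TrimSpace(string(raw))) == target {
				mu.Lock()
				tags = append(tags, path.Base(entry))
				mu.Unlock()
			}
			return nil
		})
	}
	if err := g.Wait(); err != nil {
		return nil, err
	}
	return tags, nil
}
```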

@microyahoo (Contributor)

> Hi @karmicdude, I have taken over @Antiarchitect's efforts on concurrent lookup and untag in PR distribution/distribution#4329. You can try it and check whether it brings an improvement, thanks.

Hi @karmicdude, @sebglon, @jwojnarowicz, @sidewinder12s: distribution/distribution#4329 has already been merged; could you please help try it and see whether it meets expectations?

@karmicdude

Nice, I'll definitely check it out

@snowmanstark

@wy65701436 @Vad1mo Can we have this change in v2.11.1? It will improve GC efficiency.
