
Concurrent tag lookup #3890

Closed

Conversation

Antiarchitect

I no longer have access to the Enapter/distribution repo, so I've opened a new request.
@corhere Please check whether I used errgroup correctly in the last commit. I cannot wrap the allTags iteration in a goroutine because errgroup.Wait() seems to end prematurely in that case, resulting in a send-to-a-closed-channel panic.

P.S. It seems I need help moving the concurrency factor setup into the common configuration module, but it's too complicated for me to understand how it works and what I should do.

go.mod Outdated Show resolved Hide resolved
@codecov-commenter

codecov-commenter commented May 1, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.15% 🎉

Comparison is base (29b5e79) 56.63% compared to head (4e9e8d1) 56.78%.


Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3890      +/-   ##
==========================================
+ Coverage   56.63%   56.78%   +0.15%     
==========================================
  Files         106      106              
  Lines       10674    10701      +27     
==========================================
+ Hits         6045     6077      +32     
+ Misses       3955     3951       -4     
+ Partials      674      673       -1     
Impacted Files Coverage Δ
registry/storage/registry.go 88.11% <100.00%> (-0.18%) ⬇️
registry/storage/tagstore.go 83.53% <100.00%> (+7.41%) ⬆️


Collaborator

@corhere corhere left a comment


I cannot wrap allTags iteration into routine because it seems like errgroup.Wait() ends prematurely in this case resulting send to closed channel panic.

group.Wait() returns exactly when it is supposed to: after every goroutine spawned by group.Go() has returned. If you call group.Wait() from one goroutine concurrently with another goroutine calling group.Go(), you're going to have a bad time. You need to arrange to only wait for completion of the group after you have finished spawning all the tasks in the group.

Aside from the deadlock, you are using errgroup correctly.
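The ordering described here can be sketched with the standard library's sync.WaitGroup, whose Wait follows the same discipline as errgroup's: spawn every worker first, and only then let a single helper goroutine wait and close the results channel. The function and tag-matching rule below are illustrative stand-ins, not the actual tagstore code.

```go
package main

import (
	"fmt"
	"sync"
)

// collectMatches spawns one worker per tag, then closes the results
// channel from a helper goroutine only after all workers have been
// spawned and have finished, so no send can ever hit a closed channel.
func collectMatches(tags []string, wanted string) []string {
	var wg sync.WaitGroup
	resCh := make(chan string)

	for _, tag := range tags {
		tag := tag // capture loop variable (pre-Go 1.22 semantics)
		wg.Add(1)
		go func() {
			defer wg.Done()
			if tag == wanted { // stand-in for the digest comparison
				resCh <- tag
			}
		}()
	}

	// Every worker has been spawned; waiting is now safe.
	go func() {
		wg.Wait()
		close(resCh)
	}()

	var out []string
	for tag := range resCh {
		out = append(out, tag)
	}
	return out
}

func main() {
	fmt.Println(len(collectMatches([]string{"a", "b", "a"}, "a")))
}
```

Because the closing goroutine is started only after the spawning loop has finished, the range over the channel terminates exactly when all senders are done.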


For your convenience, I have copied all my review comments for tagstore_test.go from the previous PR.

registry/storage/tagstore.go Outdated Show resolved Hide resolved
registry/storage/tagstore.go Outdated Show resolved Hide resolved
registry/storage/tagstore.go Outdated Show resolved Hide resolved
registry/storage/tagstore_test.go Outdated Show resolved Hide resolved
@@ -121,20 +171,20 @@ func TestTagStoreUnTag(t *testing.T) {

func TestTagStoreAll(t *testing.T) {
env := testTagStore(t)
tagStore := env.ts
TagStore := env.ts
Collaborator


Local variables should start with a lowercase letter.

Comment on lines +271 to +265
/// Should handle error looking up tag
env.mockDriver.GetContentError = errors.New("Lookup failure")

for i := 2; i < 15; i++ {
err = TagStore.Tag(ctx, strconv.Itoa(i), desc0)
if err != nil {
t.Fatal(err)
}
}

tags, err = TagStore.Lookup(ctx, desc0)
if err == nil {
t.Fatal("Expected error but none retrieved")
}
if len(tags) > 0 {
t.Errorf("Expected 0 tags on an error but got %d tags", len(tags))
}

// Should not error for a path not found
env.mockDriver.GetContentError = storagedriver.PathNotFoundError{}

tags, err = TagStore.Lookup(ctx, desc0)
if err != nil {
t.Fatal(err)
}
if len(tags) > 0 {
t.Errorf("Expected 0 tags on path not found but got %d tags", len(tags))
}
Collaborator


Use subtests, not /// comments.
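For reference, Go subtests group each scenario under t.Run with a descriptive name. This is a generic, self-contained sketch: classify is a hypothetical helper, not part of the tagstore tests.

```go
package main

import (
	"fmt"
	"testing"
)

func classify(n int) string {
	if n%2 == 0 {
		return "even"
	}
	return "odd"
}

// TestClassify shows the table-driven subtest style: each case gets
// its own named t.Run block instead of a comment separating sections.
func TestClassify(t *testing.T) {
	cases := []struct {
		name string
		in   int
		want string
	}{
		{"even input", 4, "even"},
		{"odd input", 7, "odd"},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			if got := classify(tc.in); got != tc.want {
				t.Errorf("classify(%d) = %q, want %q", tc.in, got, tc.want)
			}
		})
	}
}

func main() {
	// Saved as a _test.go file, TestClassify above runs via `go test`;
	// this main just exercises the helper directly.
	fmt.Println(classify(4), classify(7))
}
```

Each subtest can be run in isolation, e.g. `go test -run 'TestClassify/even_input'`.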

Author


I need to look up some good examples first. The tests were copied from elsewhere; I don't know much about Go testing at all. I'll investigate.

Member


/// is a rust doc comment; Go is way simpler 😄


registry/storage/tagstore.go Outdated Show resolved Hide resolved
@corhere
Collaborator

corhere commented May 1, 2023

P.S. Seems I need help to move concurrency factor setup into common configuration module but it seems too complicated for me to understand how it works and what should I do.

Fair enough. First let's decide where in the config structure that value should go, as that will have an impact on where and how to add the concurrency factor to the distribution configuration.

The concurrency factor, as currently implemented, sets an upper bound on the number of storage-backend operations which may be in flight concurrently to service a given (*TagStore).Lookup call. There is no global process-wide limit on the number of concurrent operations. Notably the gcs and filesystem drivers already implement a global concurrency limit via the base.NewRegulator decorator. Given there's a global limiter, how necessary is a per-operation limiter? What if we made it so that every driver could be decorated with the regulator, e.g. DISTRIBUTION_STORAGE_MAXCONCURRENCY=42?

If we do decide that a separate per-call limit is desirable, I think it should be defined as the upper bound on the number of storage-backend operations which may be in-flight concurrently in service of handling any one Distribution API client request or maintenance operation so that the same config value can also be applied elsewhere. It makes sense to me for that config value to be grouped under storage:, though I am at a bit of a loss as to what it should be named.
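The regulator mentioned above is essentially a counting semaphore. Here is a minimal sketch of the idea using a buffered channel as the semaphore; the names are illustrative, not base.NewRegulator's actual API.

```go
package main

import "fmt"

// regulator caps how many operations may run concurrently: acquiring
// a slot blocks once the buffered channel is full. base.NewRegulator
// in distribution is built on the same principle.
type regulator struct {
	sem chan struct{}
}

func newRegulator(limit int) *regulator {
	return &regulator{sem: make(chan struct{}, limit)}
}

func (r *regulator) enter() { r.sem <- struct{}{} }
func (r *regulator) exit()  { <-r.sem }

// do runs op while holding a semaphore slot.
func (r *regulator) do(op func()) {
	r.enter()
	defer r.exit()
	op()
}

func main() {
	// e.g. a hypothetical DISTRIBUTION_STORAGE_MAXCONCURRENCY=2
	r := newRegulator(2)
	done := make(chan bool)
	for i := 0; i < 5; i++ {
		go r.do(func() { done <- true })
	}
	for i := 0; i < 5; i++ {
		<-done
	}
	fmt.Println("all operations completed under the limit")
}
```

A global limit of this shape bounds total backend load regardless of which operation issues the calls, which is exactly why it raises the question of whether a separate per-call limit is still needed.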

Thoughts @milosgajdos @thaJeztah @squizzi?

@Antiarchitect Antiarchitect force-pushed the concurrent-tag-lookup branch 3 times, most recently from 993ea04 to 4ff0c0e Compare May 2, 2023 07:26
@milosgajdos
Member

Given there's a global limiter, how necessary is a per-operation limiter? What if we made it so that every driver could be decorated with the regulator, e.g. DISTRIBUTION_STORAGE_MAXCONCURRENCY=42?

The thing is, tag lookups are very "expensive" (in terms of latency) when a driver store is packed with a huge number of tags, so I think there is a place for this even within the context of a global Regulator, possibly at the expense of other driver calls.

Ideally, we would combine both of them while remaining runtime- and operationally safe. I am not quite sure at the moment what that would look like; we could add a per-call counter, but that would complicate the code 🤔

I need to have a think about this. Maybe @deleteriousEffect has some ideas around this, too

As for the config: if we do go for the latter option, i.e. a symbiosis between the global and per-call limits, I'd chuck it into

storage:
  tag:
    concurrency_limit: N # or maybe worker_limit?

@Antiarchitect
Author

@corhere @milosgajdos Let's summarize. What's left to do? Improve the tests and make the concurrency factor configurable in a general manner? I would appreciate any help with this.

@@ -134,44 +153,67 @@ func (ts *tagStore) linkedBlobStore(ctx context.Context, tag string) *linkedBlob

// Lookup recovers a list of tags which refer to this digest. When a manifest is deleted by
// digest, tag entries which point to it need to be recovered to avoid dangling tags.
func (ts *tagStore) Lookup(ctx context.Context, desc distribution.Descriptor) ([]string, error) {
func (ts *TagStore) Lookup(ctx context.Context, desc distribution.Descriptor) ([]string, error) {
allTags, err := ts.All(ctx)
switch err.(type) {
Collaborator


if err != nil {
	switch err.(type) {
	...
	}
}

@wy65701436
Collaborator

let's define a new struct like,

type tagRes struct {
	tag     string
	matches bool
}

resChan := make(chan tagRes)
group, ctx := errgroup.WithContext(ctx)

for _, tag := range allTags {
	tag := tag
	group.Go(func() error {
		...
		matches := (tagDigest == desc.Digest)
		resChan <- tagRes{tag: tag, matches: matches}
		...
		return nil
	})
}

go func() {
	group.Wait()
	close(resChan)
}()

var tags []string
for result := range resChan {
	if result.matches {
		tags = append(tags, result.tag)
	}
}

if err := group.Wait(); err != nil {
	return nil, err
}

return tags, nil

@microyahoo
Contributor

Has there been any progress on this PR? Is there a plan for merging it? @Antiarchitect @corhere

@Antiarchitect
Author

It's been a while since I last worked on this MR. If I remember correctly there was something with the tests. I don't mind if someone more experienced in Go takes this over.

@github-actions github-actions bot added area/storage dependencies Pull requests that update a dependency file labels Apr 17, 2024
@microyahoo
Contributor

It's been a while since I last worked on this MR. If I remember correctly there was something with the tests. I don't mind if someone more experienced in Go takes this over.

hi @Antiarchitect, thanks for your contribution. I will take it over if you don't have enough bandwidth.

@Antiarchitect
Author

@microyahoo That would be nice. Thank you.

@Antiarchitect Antiarchitect force-pushed the concurrent-tag-lookup branch 2 times, most recently from 1a284c2 to 363a5bc Compare April 17, 2024 12:18
    Based on distribution#3589

Signed-off-by: Andrey Voronkov <voronkovaa@gmail.com>
microyahoo added a commit to microyahoo/distribution that referenced this pull request Apr 18, 2024
Harbor uses distribution for its (harbor-registry) registry component.
The harbor GC will call into the registry to delete the manifest, which in turn
does a lookup for all tags that reference the deleted manifest.
To find the tag references, the registry iterates over every tag in the repository
and reads its link file to check whether it matches the deleted manifest (i.e. to see
if it uses the same sha256 digest). So, the more tags in the repository, the worse the
performance will be (as there will be more s3 API calls for the tag
directory lookups and tag file reads).

Therefore, we can use concurrent lookup and untag to optimize performance as described in goharbor/harbor#12948.

This optimization was originally contributed by @Antiarchitect; now I would like to take it over.
Thanks for @Antiarchitect's efforts in PR distribution#3890.

Signed-off-by: Liang Zheng <zhengliang0901@gmail.com>
@Antiarchitect
Author

Closing in favor of #4329 by @microyahoo

Labels
area/storage dependencies Pull requests that update a dependency file enhancement