Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix large exports failing #5388

Merged
merged 1 commit into from
May 23, 2024
Merged

Fix large exports failing #5388

merged 1 commit into from
May 23, 2024

Conversation

hstct
Copy link
Contributor

@hstct hstct commented May 15, 2024

closes #5375

@hstct hstct force-pushed the fix_large_exports branch 2 times, most recently from 5570469 to 9d8db20 Compare May 15, 2024 16:03
@hstct
Copy link
Contributor Author

hstct commented May 15, 2024

This is a still in progress work. It does not account for cases with other storage backends. And the solution for the string operation of the artifact paths is still hackish.

However it did successfully export a large RPM content view with more than 65k packages which previously failed.

@quba42 quba42 force-pushed the fix_large_exports branch 2 times, most recently from c64dace to 57cd9d4 Compare May 16, 2024 14:02
@quba42 quba42 force-pushed the fix_large_exports branch 2 times, most recently from 4d03dc0 to ce0c796 Compare May 16, 2024 15:12
@hstct hstct marked this pull request as ready for review May 17, 2024 08:34
quba42 added a commit to ATIX-AG/pulpcore that referenced this pull request May 22, 2024
closes OR-4701

Upstream PR: pulp#5388

closes pulp#5375

Co-authored-by: Tobias Grigo <56518487+hstct@users.noreply.github.com>
dralley
dralley previously approved these changes May 22, 2024
Copy link
Contributor

@dralley dralley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks reasonable!

We need some more extensive import/export testing, it's just difficult to set up.

Let's squash the commits and get one additional review

@dralley dralley enabled auto-merge (rebase) May 22, 2024 18:29
@dralley dralley requested a review from ipanova May 22, 2024 18:55
auto-merge was automatically disabled May 23, 2024 08:05

Head branch was pushed to by a user without write access

batch = artifact_pks[offset : offset + EXPORT_BATCH_SIZE]
batch_qs = Artifact.objects.filter(pk__in=batch).only("file")

for artifact in pb.iter(batch_qs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could use iterator() here as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like this? Not sure what difference it makes.

I did re-test the large export with the extra .iterator().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading that, .iterator() sounds like the right choice for batch query sets. Thanks for the link.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt it actually mattered in this instance as the default batch size is 2000, and we're already doing external batching... with a batch size of 2000

But it doesn't hurt anything so no need to revert.

closes pulp#5375

Co-authored-by: Tobias Grigo <56518487+hstct@users.noreply.github.com>
@dralley dralley enabled auto-merge (rebase) May 23, 2024 13:23
@dralley dralley merged commit f3277fe into pulp:main May 23, 2024
16 checks passed
Copy link

patchback bot commented May 23, 2024

Backport to 3.39: 💚 backport PR created

✅ Backport PR branch: patchback/backports/3.39/f3277fe971a8014eb96d8e54a8b58671fe71ba3b/pr-5388

Backported as #5416

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

Copy link

patchback bot commented May 23, 2024

Backport to 3.49: 💚 backport PR created

✅ Backport PR branch: patchback/backports/3.49/f3277fe971a8014eb96d8e54a8b58671fe71ba3b/pr-5388

Backported as #5418

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Export of large amount of packages will result in errors
5 participants