Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use new createrepo_c API to simplify code #3461

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

dralley
Copy link
Contributor

@dralley dralley commented Mar 11, 2024

[noissue]

@dralley dralley force-pushed the createrepo-new-api branch 2 times, most recently from 363da96 to 9477131 Compare March 11, 2024 22:38
@github-actions github-actions bot added the multi-commit Added when a PR consists of more than one commit label Mar 11, 2024
@dralley dralley marked this pull request as draft March 11, 2024 23:37
@dralley
Copy link
Contributor Author

dralley commented Mar 11, 2024

Found a bug, blocked: rpm-software-management/createrepo_c#426

general = checksum_types.get("general")
metadata = checksum_types.get("metadata")
# fallback order
checksum_type = general or metadata or original.get(name) or default
checksum_type = general or metadata or default
Copy link
Contributor Author

@dralley dralley Mar 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's safe to ditch this "original checksum" logic. I think the idea was to publish using the same checksums that the metadata had when it was synced. I don't know that it's particularly valuable now that we've started phasing out the legacy checksum types. The code was added back in 2020

In any event, it's a bit silly to do that for the checksum but not the compression type

@ggainey Can you think of any particular reason why we might want to prefer the original checksum types over the default of sha256?

Or maybe we shouldn't get rid of this entirely, but we can merge it into last_sync_details.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So first, to make sure I'm understanding what we're doing here - original_checksum_types holds the type-of-checksum that repomd.xml declared each metadata file was summed-with, as of last-sync. Since we always "have" the metadata files post-sync, no matter what mode was used, we will always have sha256 available.

This is only used when publishing, correct?

If we're doing a full-mirror-sync, we don't recreate the metadata files - we just use whatever is there.

On the one hand - it's probably safe to remove this. On the other - the cost of carrying it is low, and every single time we have "cleaned up" code involving "old checksum decisions nobody will ever need in Pulp again", we've ended up having to put it back.

So my take is, "the code works as-is and the benefit from cleaning it up is...minimal". I'd leave it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cost in this case is that the API we want to use doesn't support picking the checksum on a per file basis. So it's either get rid of it or don't do this PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK that def changes the math :) Using the new API is demonstrably preferable, not-using "original" is unlikely to impact us. If we take your initial uggestion (store into last_sync_details), then we have the info available if anyone cares (or, more important.y, we decide we have to "fix" something...)

@github-actions github-actions bot removed the multi-commit Added when a PR consists of more than one commit label May 6, 2024
@github-actions github-actions bot added the multi-commit Added when a PR consists of more than one commit label May 9, 2024
@github-actions github-actions bot removed the multi-commit Added when a PR consists of more than one commit label May 9, 2024
@dralley dralley force-pushed the createrepo-new-api branch 6 times, most recently from 96f5c49 to 302bb10 Compare May 10, 2024 20:54
@github-actions github-actions bot added the multi-commit Added when a PR consists of more than one commit label May 30, 2024
@dralley dralley requested a review from ggainey May 30, 2024 02:45
@dralley dralley marked this pull request as ready for review May 30, 2024 02:45
The new API can't handle it - it's also largely not relevant anymore.

[noissue]
Zero-downtime migrations make things difficult

[noissue]
@dralley dralley marked this pull request as draft May 30, 2024 13:48
@dralley dralley removed the request for review from ggainey May 30, 2024 13:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
multi-commit Added when a PR consists of more than one commit
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants