-
Notifications
You must be signed in to change notification settings - Fork 284
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refs #37455 - Create Katello push repositories as needed at container push time #10995
base: master
Are you sure you want to change the base?
Conversation
The errors thrown are a little messy. They are all generic errors and they tend to get truncated on the podman side. I'm happy to improve them if you think it's a good idea. |
Containers with tags such as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw this error on my first push:
container_repository_name: ["for repository 'fedora-bootc' is not unique and cannot be created in 'Library'. Its Container Repository Name (default_organization-buttermilk_biscuits-fedora-bootc) conflicts with an existing repository. Consider changing the Lifecycle Environment's Registry Name Pattern to something more specific."]
I do have another container repo, but its label is centos-bootc_fedora-bootc
The error didn't throw until later. Was repo creation attempted twice maybe?
I pushed:
podman push 75eca4dcedfb `hostname`/default_organization/buttermilk_biscuits/fedora-bootc
Well shoot. Podman calls the endpoint that calls Another option is that the |
I deleted all of my repos and tried again, ended up hitting:
|
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
I realize that we will definitely need some way to tell push repositories apart from normal ones. Relying on the version_href to say "container-push" in it could be a pretty good way to do so. When making content views, we'll need it to determine how contents should be copied. We'll also need it for deleting container push repositories, since it's the distribution that needs to be deleted rather than the repo itself. |
I missed this comment before. The tag is separate from the repository name -- it is tagging the manifest that the user is pushing up. Under the hood, if not tag is noted, podman will auto-assign latest to the manifest being pushed. I'm guessing in this case then that all pushed content is getting the latest tag. It's important that tags are taken in exactly as they're pushed. |
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
I tested passing in custom tags and it works :)
|
ianballou and I talked about this outside of the PR and we came to the conclusion that we are handling tags correctly at the moment. Tag info shouldn't need to be stored anywhere in the repo; handling everything with Pulp should be completely fine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weird errors aside, pushing via name and via IDs are both creating repositories in the proper places 👍
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
As we discussed in-person before, the main thing missing now is being able to index content. To do so, I'd recommend a strategy like:
def container_push_repositories_api
PulpContainerClient::RepositoriesContainerPushApi.new(api_client)
end
# ...
repo.update(version_href: ::Katello::Pulp3::Repository.api(SmartProxy.pulp_primary, ::Katello::Repository::DOCKER_TYPE).
container_push_repositories_api.list(name: container_repo_name_here).results.first.latest_version_href)
# ... (Determined via https://www.rubydoc.info/gems/pulp_container_client/2.20.0/PulpContainerClient/RepositoriesContainerPushApi)
Create the ::Katello::Pulp3::RepositoryReference that would normally happen during the Pulp 3 repo creation. This is skipped because Pulp is making the repo for us. # pulp_repo_href looks like /pulp/api/v3/repositories/container/container-push/018f9759-7d55-7b1e-82b0-f2f062ebd606/
# Note that there is no version -- this is just the repository reference, not a reference to any repository version
RepositoryReference.where(
root_repository_id: repo.root_id,
content_view_id: repo.content_view.id,
repository_href: pulp_api_response.pulp_href).create!
Bonus points: Using a similar method as above, also find the distribution and save it as a -> API: |
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
@ianballou I really appreciate all the help with content indexing above (too big to quote). The newest update to the PR should fix the duplicate Pulp entry, set the root repo's |
Already can see one improvement:
|
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
One thing I noticed is that errors are not easily recovered from -- in one case my push succeeded, but the repository remained and I had to do some cleanup before trying to push again. We could potentially deal with it in a later PR, depends on if it ends up being a bigger issue. |
I'm seeing some issues with bigger container images where the repository is created twice. I can reproduce the issue solidly with |
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
@@ -85,6 +84,223 @@ def registry_authorize | |||
return false | |||
end | |||
|
|||
def handle_blob_push(props = nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's doing more than handling blobs -- maybe change the name to something like run_pre_push_checks
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up splitting this into two before_actions: blob_push_validation
and create_container_repo_if_needed
. I think this will also make the lock implementation cleaner. Does that sound ok to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm looking into if I can move the logic for testing if a container is needed into the before_action
line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point was more that it's also handling pre-checks for manifests, so having "blob" in the name was a little misleading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does container_push_prop_validation
sound?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finished up a functional test, was able to push the fedora-bootc image multiple times and the content all ended up in the same repository. It seems like the concurrency problem was solved, nice!
One issue I noticed -- pushing by IDs to a repository right after pushing by name created a different repository in Pulp. It didn't create another repository in Katello though, which is correct. As a result though, the content pushed by ID in Pulp appears to have disappeared in the Katello UI.
I was going to suggest translating the IDs to names before pushing content, but that would defeat the whole purpose of being able to differentiate the container names in Pulp.
To fix this, I think we'd need to know if a repository was originally pushed by name in order to decide which repo to push content to -- the "name" repo, or the "ID" repo.
If a user later creates an org / product that makes the naming ambiguous, they'll need to be able to keep pushing content to their original repositories that they pushed to by name. To do so, they'll need to start pushing by ID.
skip_before_action :authorize | ||
before_action :optional_authorize, only: [:token, :catalog] | ||
before_action :registry_authorize, except: [:token, :v1_search, :catalog] | ||
before_action :authorize_repository_read, only: [:pull_manifest, :tags_list] | ||
# TODO: authorize_repository_write commented out until Katello indexes Pulp container push repositories. | ||
# before_action :authorize_repository_write, only: [:start_upload_blob, :upload_blob, :finish_upload_blob, | ||
# :push_manifest] | ||
# before_action :authorize_repository_write, only: [:start_upload_blob, :upload_blob, :finish_upload_blob, :push_manifest] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Up to you if you want to leave this for a second PR, I know this one is already so large.
@ianballou This newest push is very messy, sorry about that. I took one of your suggestions above and inverted it a little; pulp is now saving containers by id instead of name. I'm pretty sure there's a decent amount of work to do to fix this up on the container pull side, but push seems to work ok. What are your thoughts? Also: still working on the test cases. I think you can see my paltry progress in this push. |
I wonder if this is starting to get more complicated than we need, especially since this capitalization scenario we're guarding against should be very rare. I'm trying to think if we could completely throw out the "by ID" style to ensure that Katello and Pulp are always aligned. |
Throwing out the ID method entirely will force the rare ambiguous org / product owner to recreate the org / product. I think the safer middle ground would be to check if a user is trying to push content to a repository with a different name / ID scheme and to tell them to destroy it first. It makes for a decent experience for the very rare ambiguous org/product user, and shouldn't over-complicate the code for the 99% use-case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error messages are looking better! One issue I noticed:
- Push an image by name/label
- Create another org with the same name
- Create a product in the new org with the same name as the product in the old org
- Push the same image in step (1) to the product in step (3) by ID
- The ID repo gets created in Pulp, but no repository is created in the new org/product.
It looks like the Root repository was created under the new org. They do have the proper product association as well. However, neither of them have any Repository records.
The erroneous Root repositories also don't have any RepositoryReferences.
The first repository pushed by name is all correct, it seems.
Edit: I was able to push a new repo via ID to the new org, but it doesn't seem to have the version_href or distribution references set up.
I think in my Pulp the arianna repository already existed from a prior push.
-> It looks like this didn't matter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Things have changed a bit -- now I'm seeing:
Error: writing manifest: uploading manifest latest to centos9-katello-devel.manicotto.example.com/id/3/2/curl: blob upload unknown: Could not locate local uploaded repository for content indexing.
when trying to push the same container repository (curl) to a second org with the same name by ID.
However, if I push a brand new container repository, it seems to work just fine now.
-----This PR is not yet feature complete. This is likely broken in a few ways and still needs unit tests.-----
What are the changes introduced in this pull request?
Considerations taken when implementing this change?
TODO - section will change as PR changes
What are the testing steps for this pull request?