-
Notifications
You must be signed in to change notification settings - Fork 284
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refs #37455 - Create Katello push repositories as needed at container push time #10995
base: master
Are you sure you want to change the base?
Conversation
The errors thrown are a little messy. They are all generic errors and they tend to get truncated on the podman side. I'm happy to improve them if you think it's a good idea. |
Containers with tags such as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw this error on my first push:
container_repository_name: ["for repository 'fedora-bootc' is not unique and cannot be created in 'Library'. Its Container Repository Name (default_organization-buttermilk_biscuits-fedora-bootc) conflicts with an existing repository. Consider changing the Lifecycle Environment's Registry Name Pattern to something more specific."]
I do have another container repo, but its label is centos-bootc_fedora-bootc
The error didn't throw until later. Was repo creation attempted twice maybe?
I pushed:
podman push 75eca4dcedfb `hostname`/default_organization/buttermilk_biscuits/fedora-bootc
Well shoot. Podman calls the endpoint that calls Another option is that the |
I deleted all of my repos and tried again, ended up hitting:
|
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
I realize that we will definitely need some way to tell push repositories apart from normal ones. Relying on the version_href to say "container-push" in it could be a pretty good way to do so. When making content views, we'll need it to determine how contents should be copied. We'll also need it for deleting container push repositories, since it's the distribution that needs to be deleted rather than the repo itself. |
I missed this comment before. The tag is separate from the repository name -- it is tagging the manifest that the user is pushing up. Under the hood, if not tag is noted, podman will auto-assign latest to the manifest being pushed. I'm guessing in this case then that all pushed content is getting the latest tag. It's important that tags are taken in exactly as they're pushed. |
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
I tested passing in custom tags and it works :)
|
ianballou and I talked about this outside of the PR and we came to the conclusion that we are handling tags correctly at the moment. Tag info shouldn't need to be stored anywhere in the repo; handling everything with Pulp should be completely fine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weird errors aside, pushing via name and via IDs are both creating repositories in the proper places 👍
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
As we discussed in-person before, the main thing missing now is being able to index content. To do so, I'd recommend a strategy like:
def container_push_repositories_api
PulpContainerClient::RepositoriesContainerPushApi.new(api_client)
end
# ...
repo.update(version_href: ::Katello::Pulp3::Repository.api(SmartProxy.pulp_primary, ::Katello::Repository::DOCKER_TYPE).
container_push_repositories_api.list(name: container_repo_name_here).results.first.latest_version_href)
# ... (Determined via https://www.rubydoc.info/gems/pulp_container_client/2.20.0/PulpContainerClient/RepositoriesContainerPushApi)
Create the ::Katello::Pulp3::RepositoryReference that would normally happen during the Pulp 3 repo creation. This is skipped because Pulp is making the repo for us. # pulp_repo_href looks like /pulp/api/v3/repositories/container/container-push/018f9759-7d55-7b1e-82b0-f2f062ebd606/
# Note that there is no version -- this is just the repository reference, not a reference to any repository version
RepositoryReference.where(
root_repository_id: repo.root_id,
content_view_id: repo.content_view.id,
repository_href: pulp_api_response.pulp_href).create!
Bonus points: Using a similar method as above, also find the distribution and save it as a -> API: |
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
@ianballou I really appreciate all the help with content indexing above (too big to quote). The newest update to the PR should fix the duplicate Pulp entry, set the root repo's |
Already can see one improvement:
|
) unless root_repo.present? && instance_repo.present? | ||
|
||
api = ::Katello::Pulp3::Repository.api(SmartProxy.pulp_primary, ::Katello::Repository::DOCKER_TYPE).container_push_api | ||
latest_version_href = api.list(name: @container_path_input).results.first.latest_version_href |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
api.list(name: @container_path_input).results.first
was nil for me when pushing via the ID scheme for some reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is interesting. @container_path_input should be exactly what the user entered. I'll look into it.
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
@@ -71,7 +71,7 @@ class RootRepository < Katello::Model | |||
validates_with Validators::ContainerImageNameValidator, :attributes => :docker_upstream_name, :allow_blank => true, :if => :docker? | |||
|
|||
validate :ensure_valid_docker_attributes, :if => :docker? | |||
validate :ensure_docker_repo_unprotected, :if => :docker? | |||
#validate :ensure_docker_repo_unprotected, :if => :docker? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to tomorrow Quinn: look into this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure we can completely remove the validation and the validation method.
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
One thing I noticed is that errors are not easily recovered from -- in one case my push succeeded, but the repository remained and I had to do some cleanup before trying to push again. We could potentially deal with it in a later PR, depends on if it ends up being a bigger issue. |
I'm seeing some issues with bigger container images where the repository is created twice. I can reproduce the issue solidly with |
app/controllers/katello/api/registry/registry_proxies_controller.rb
Outdated
Show resolved
Hide resolved
unprotected: true, | ||
is_container_push: true | ||
) | ||
sync_task(::Actions::Katello::Repository::CreateRoot, root, @container_path_input) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found an interesting issue here -- there seems to be a race condition.
Pushing content is concurrent. That means that multiple threads could be in this if statement at once time. So, ::Actions::Katello::Repository::CreateRoot
is being run even when @product.root_repositories.where(label: @container_name).empty?
turns false.
You can reproduce this in the console by sticking a debugger right before the sync_task call. Multiple different debuggers will run at once due to the concurrency, and they'll all queue up before the sync_task call.
This is tricky because we need a way for one of the concurrent calls from podman to become the "chosen one" and for the rest of the calls to wait. One idea that comes to mind would be to implement some sort of lock that is saved in the database.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I confirmed that using a single-threaded development environment "solves" the issue, so concurrency does seem to be the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One idea would be to use https://apidock.com/rails/v6.0.0/ActiveRecord/Relation/create_or_find_by
(or maybe create_or_find_by!
)
I wonder if it would be easier to stop using the CreateRoot action and instead create the database entries manually. We'd then be able to remove the conditionals from the current actions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Locking is potentially error-prone, so let's try that only as a last resort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for inspiration, https://api.rubyonrails.org/v7.1.3.2/classes/ActiveRecord/Relation.html#method-i-find_or_initialize_by is also available -- it would allow to instantiate the RootRepository object OR actually get a record if it exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One idea would be to use https://apidock.com/rails/v6.0.0/ActiveRecord/Relation/create_or_find_by
(or maybe
create_or_find_by!
)I wonder if it would be easier to stop using the CreateRoot action and instead create the database entries manually. We'd then be able to remove the conditionals from the current actions.
I've been experimenting with that for testing different approaches. I'm going to try and roll everything back into the foreman tasks unless that breaks concurrency, just so there isn't two different methods in Katello that do the same thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also thank you for the resources. Have been trying to get create_or_find_by to work but there are a lot of layers of foreman tasks to unroll.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As of the newest push, I believe this is working correctly. I used create_or_find_by
inside of app/models/katello/glue/pulp/repos.rb
. When you have time, please confirm the issue is fixed on your end (I've been having trouble getting the endpoint to be hit simultaneously on my device).
@@ -85,6 +84,223 @@ def registry_authorize | |||
return false | |||
end | |||
|
|||
def handle_blob_push(props = nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's doing more than handling blobs -- maybe change the name to something like run_pre_push_checks
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up splitting this into two before_actions: blob_push_validation
and create_container_repo_if_needed
. I think this will also make the lock implementation cleaner. Does that sound ok to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm looking into if I can move the logic for testing if a container is needed into the before_action
line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point was more that it's also handling pre-checks for manifests, so having "blob" in the name was a little misleading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does container_push_prop_validation
sound?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finished up a functional test, was able to push the fedora-bootc image multiple times and the content all ended up in the same repository. It seems like the concurrency problem was solved, nice!
One issue I noticed -- pushing by IDs to a repository right after pushing by name created a different repository in Pulp. It didn't create another repository in Katello though, which is correct. As a result though, the content pushed by ID in Pulp appears to have disappeared in the Katello UI.
I was going to suggest translating the IDs to names before pushing content, but that would defeat the whole purpose of being able to differentiate the container names in Pulp.
To fix this, I think we'd need to know if a repository was originally pushed by name in order to decide which repo to push content to -- the "name" repo, or the "ID" repo.
If a user later creates an org / product that makes the naming ambiguous, they'll need to be able to keep pushing content to their original repositories that they pushed to by name. To do so, they'll need to start pushing by ID.
skip_before_action :authorize | ||
before_action :optional_authorize, only: [:token, :catalog] | ||
before_action :registry_authorize, except: [:token, :v1_search, :catalog] | ||
before_action :authorize_repository_read, only: [:pull_manifest, :tags_list] | ||
# TODO: authorize_repository_write commented out until Katello indexes Pulp container push repositories. | ||
# before_action :authorize_repository_write, only: [:start_upload_blob, :upload_blob, :finish_upload_blob, | ||
# :push_manifest] | ||
# before_action :authorize_repository_write, only: [:start_upload_blob, :upload_blob, :finish_upload_blob, :push_manifest] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Up to you if you want to leave this for a second PR, I know this one is already so large.
@ianballou This newest push is very messy, sorry about that. I took one of your suggestions above and inverted it a little; pulp is now saving containers by id instead of name. I'm pretty sure there's a decent amount of work to do to fix this up on the container pull side, but push seems to work ok. What are your thoughts? Also: still working on the test cases. I think you can see my paltry progress in this push. |
I wonder if this is starting to get more complicated than we need, especially since this capitalization scenario we're guarding against should be very rare. I'm trying to think if we could completely throw out the "by ID" style to ensure that Katello and Pulp are always aligned. |
Throwing out the ID method entirely will force the rare ambiguous org / product owner to recreate the org / product. I think the safer middle ground would be to check if a user is trying to push content to a repository with a different name / ID scheme and to tell them to destroy it first. It makes for a decent experience for the very rare ambiguous org/product user, and shouldn't over-complicate the code for the 99% use-case. |
-----This PR is not yet feature complete. This is likely broken in a few ways and still needs unit tests.-----
What are the changes introduced in this pull request?
Considerations taken when implementing this change?
TODO - section will change as PR changes
What are the testing steps for this pull request?