
fix diffparam_volume and add diffparam_mount and diffparam_tmpfs #448

Open
wants to merge 8 commits into master
Conversation

rubiksdot
Contributor

This is a fix for bug #355.

This compares the arguments podman received for the currently existing
container with the (effective) arguments about to be applied.

This approach was taken over parsing Mounts from inspect because:

  1. This line from https://github.com/containers/podman/blob/e084f0ee1e1ded564110e84eefca9f18b5669adf/libpod/container_inspect.go#L224
    regarding inspect's Mount value:

    "Only includes user-specified mounts. Only includes bind mounts and named volumes, not tmpfs volumes."

    Thus an incomplete solution would result.

  2. The code required to parse those values so that they could be
    compared with the incoming configuration was getting silly-level
    complex.

  3. Anonymous volumes were impossible to decipher as both Name and
    Source are filled in with the podman-generated values.

Thus we compare the arguments podman create was run with to create the
existing container against the closest equivalent values derived from
the new config.

This resulted in simpler code that takes care of the issue of anonymous
volumes.

The downside is that if someone moves, say, a tmpfs from mount to tmpfs
(or vice versa) in a way that produces exactly the same result, this
will be considered a different config. This can (possibly) be fixed if
and when it becomes an actual issue.

Signed-off-by: Andrew rubiksdot@grue.cc
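A rough sketch of the comparison idea (hypothetical helper and simplified argument handling, not the module's actual code; the existing container's original command line is available from podman inspect, e.g. under Config.CreateCommand):

def mounts_differ(existing_create_args, new_create_args):
    # Gather the values that follow mount-related flags and compare the
    # two sorted lists. Only the space-separated "--flag value" form is
    # handled in this sketch.
    def mount_args(args):
        flags = ('--volume', '-v', '--mount', '--tmpfs')
        return sorted(arg for i, arg in enumerate(args)
                      if i > 0 and args[i - 1] in flags)
    return mount_args(existing_create_args) != mount_args(new_create_args)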

Andrew added 2 commits July 10, 2022 16:20
This is a fix for bug containers#355.

Signed-off-by: Andrew <rubiksdot@grue.cc>
The old code removed unnecessary slashes via the _clean_volume()
function. I accidentally got rid of it and have re-introduced it
here as self._clean_path().

The rename is because it is used outside of mere check of volumes
but the internal code is as before.

Signed-off-by: Andrew <rubiksdot@grue.cc>
@rubiksdot
Contributor Author

Ok. Still getting 2 failures but I'm not sure how. Is it getting its volumes from containers.conf?

Would love to see the values of self.params and self.info for the fail. :)

@sshnaidm
Member

If you look at the logs of the failing job, you can see a difference between the runs:

--- before
+++ after
@@ -1 +1 @@
-volume - ['/', ':', 'd', 'e', 'i', 'm', 'o', 'p', 'r', 's', 't']
+volume - ['/opt/://somedir/']

As you can see, it treats the string as a list.
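That garbled value looks exactly like what Python produces when a bare string is deduplicated and sorted as if it were a list of volume specs (illustrative snippet, not the module's code):

volumes = '/opt/://somedir/'      # a single string where a list was expected
print(sorted(set(volumes)))       # ['/', ':', 'd', 'e', 'i', 'm', 'o', 'p', 'r', 's', 't']

volumes = ['/opt/://somedir/']    # what was intended
print(sorted(set(volumes)))       # ['/opt/://somedir/']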

Began sorting in self._clean_path_in_mount_str() as an early
defense against the ordering of mount arguments changing. This guards
against false comparison errors later.

Fixed lots of embarrassing string-in-list-context errors. Oops! :(

Signed-off-by: Andrew <rubiksdot@grue.cc>
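The sorting described above amounts to roughly this kind of normalization (hypothetical helper name, simplified from what the commit describes for _clean_path_in_mount_str()):

def normalize_mount_options(mount_str):
    # Sort the comma-separated options of a --mount argument so that the
    # order they were written in cannot produce a false "changed" result.
    return ','.join(sorted(mount_str.split(',')))

normalize_mount_options('destination=/tmp,type=tmpfs,tmpfs-size=512M')
# 'destination=/tmp,tmpfs-size=512M,type=tmpfs'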
@rubiksdot
Contributor Author

Totally missed that fail. Thanks. Obviously I was not in a good state to have been coding last night. :) This now leaves the one I WAS looking at, which comes up with

--- before
+++ after
@@ -1 +1 @@
-user - user2
+user - user

...

STDERR:
time="2022-07-11T01:51:15Z" level=warning msg="The cgroupv2 manager is set to systemd but there is no systemd user session available"
time="2022-07-11T01:51:15Z" level=warning msg="For using systemd, you may need to login using an user session"
time="2022-07-11T01:51:15Z" level=warning msg="Alternatively, you can enable lingering with: `loginctl enable-linger 1001` (possibly as root)"
time="2022-07-11T01:51:15Z" level=warning msg="Falling back to --cgroup-manager=cgroupfs"

And that has me lost, still. :(

@sshnaidm
Member

sshnaidm commented Jul 11, 2022

What I see now is a failure in the test at tests/integration/targets/podman_container_idempotency/tasks/idem_volumes.yml:58

- containers.podman.podman_container:
    image: "{{ idem_image }}"
    name: idempotency
    state: present
    volumes:
      - /opt/://somedir/
    command: 1h
  register: test5

and its result is:

--- before
+++ after
@@ -1 +1 @@
-volume - ['/opt:/somedir/']
+volume - ['/opt/://somedir/']

You probably need to strip extra slashes from the beginning of the string too? You also need to remove the trailing slash from somedir/.

@sshnaidm
Member

In general it would be great if you added a few tests for mounts, tmpfs and the other functionality you add in this PR.

Made _clean_path() squash multiple consecutive slashes into one, as
3+ slashes are just as meaningful as 1.

Slash handling is now done properly in diffparam_volume().

Signed-off-by: Andrew <rubiksdot@grue.cc>
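The cleaning described in this commit amounts to something like the following (a minimal sketch with hypothetical names, not the module's actual _clean_path()):

import re

def clean_path(path):
    # Squash runs of slashes into one and drop a trailing slash,
    # keeping a lone "/" intact.
    path = re.sub(r'/+', '/', path)
    return path if path == '/' else path.rstrip('/')

def clean_volume_spec(spec):
    # Normalize each component of a "src:dest[:options]" volume string.
    return ':'.join(clean_path(part) for part in spec.split(':'))

clean_volume_spec('/opt/://somedir/')   # '/opt:/somedir'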
@rubiksdot
Contributor Author

In general it would be great if you added a few tests for mounts, tmpfs and the other functionality you add in this PR.

That's a fair call, and I was testing (really!). Normally it takes me ages to finish some coding as I tend to test as I go, but this time I took too many shortcuts, so whilst I was testing all three I wasn't testing them well. :(

My apologies for the commit spam and added hassle.

Hopefully this shall be the final commit to this PR. I may die of embarrassment if not. :/

@rubiksdot
Contributor Author

RIP. :(

Ok. The lint failures are about 2 spaces (gah!) and I've fixed those, but I don't want to submit yet another patch until I have the others sorted.

I've looked at test9, which is failing. My understanding is that it is there to clear all mounts associated with the container, and given that tests 7 and 8 add and change volumes, test9 should change the container. This would be consistent with test6 (as I understand it). Test9, though, asserts that the container should not be changed.

Is my reasoning wrong, is test9 not asserting correctly, or do I have yet another bug in my code? In my own tests doing the exact same thing removes all volumes, and so ansible declares the task as changed.

@sshnaidm
Member

@rubiksdot so the test9 task shouldn't be changed because I use a specific image for this kind of test, and it has some "volumes" defined in its Dockerfile.

That's why running the container with /data shouldn't change anything, because it's already in the image.
Sorry for that mess; we had bugs where the container was recreated again and again because the image it was built from had volumes specified.
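For reference, the volumes an image already declares can be read from its config (an illustrative snippet; podman image inspect reports Dockerfile VOLUME entries under Config.Volumes):

import json
import subprocess

def image_volumes(image):
    # Return the set of mount points declared by VOLUME in the image.
    out = subprocess.check_output(['podman', 'image', 'inspect', image])
    config = json.loads(out)[0].get('Config', {})
    return set((config.get('Volumes') or {}).keys())

# For the test image discussed here this would include /data (and /data2,
# per the VOLUME line quoted later in this thread).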

@rubiksdot
Contributor Author

All good. I'm starting to think anonymous volumes are evil. :)

So if I have the logic right: since these are anonymous volumes and since they are minty fresh with each restart of a container (because new ones get auto-created), changing a mount that is an anonymous volume to an anonymous volume should not trigger a container change because the effect of doing so is no different to the effect of not doing so?

And here we have anonymous volumes created both via VOLUME in the Dockerfile and via --volume on the CLI, and these are no different from each other in functionality.

@rubiksdot
Contributor Author

Ok. So I've been working on this and took it a step further so that you can move equivalent mounts between tmpfs, mount and volume.

The downside I'm finding as I am debugging it is that it makes self.diff not match the arguments so it feels very dirty.

Is there interest in going down this path? If so does the diff need to match the arguments? I'm thinking yes so I'm thinking of how to allow the movement without messing up the diff (too much ;).

@sshnaidm
Member

Ok. So I've been working on this and took it a step further so that you can move equivalent mounts between tmpfs, mount and volume.

The downside I'm finding as I am debugging it is that it makes self.diff not match the arguments so it feels very dirty.

Is there interest in going down this path? If so does the diff need to match the arguments? I'm thinking yes so I'm thinking of how to allow the movement without messing up the diff (too much ;).

Sorry, what is the issue? Do you mean keywords for self._diff_update_and_compare? If so, I don't think it's a big problem. But we still need to keep in mind the volumes predefined in the image.

@rubiksdot
Contributor Author

Haven't dropped this. Just RL got busy and I now came across a bug that requires a bit of a rewrite so still working on it, RL permitting.

rubiksdot and others added 2 commits September 29, 2022 11:22
Combine diffparam_mount(), diffparam_tmpfs() and diffparam_volume()
into one function, add handling of VOLUME, and allow definitions to
move between the various ways of specifying mount points as long as
the options are equivalent.

This necessitated translations of the various ways of defining
mount points into one single "language" so that the definitions
may be compared and checked for similarity. prep_mount_args_for_comp()
does this work and includes within it the default options for the
various CLI arguments and VOLUME.

The consequence of this is that the diff returned by the function
no longer literally represents the real-life arguments passed but,
rather, the translations of those arguments.

Signed-off-by: Andrew <rubiksdot@grue.cc>
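The translation might look roughly like this (a simplified, hypothetical sketch of the idea, not the actual prep_mount_args_for_comp(), and ignoring the default options the commit mentions):

def canonical_mount(kind, spec):
    # Reduce --mount / --tmpfs / --volume (and VOLUME) specs to one
    # comparable (destination, properties) form.
    if kind == 'mount':                 # e.g. "type=tmpfs,destination=/tmp,tmpfs-size=512M"
        kv = dict(p.split('=', 1) for p in spec.split(',') if '=' in p)
        dest = kv.pop('destination', None) or kv.pop('target', None)
        return dest, {'type': kv.pop('type', 'bind'),
                      'source': kv.pop('source', None),
                      'options': sorted(kv.items())}
    if kind == 'tmpfs':                 # e.g. "/tmp:size=512M,mode=1777"
        dest, _, opts = spec.partition(':')
        return dest, {'type': 'tmpfs', 'source': None,
                      'options': sorted(opts.split(',')) if opts else []}
    # 'volume': "src:dest[:opts]", or just "dest" for an anonymous volume
    parts = spec.split(':')
    src = parts[0] if len(parts) > 1 else None
    dest = parts[1] if len(parts) > 1 else parts[0]
    opts = parts[2].split(',') if len(parts) > 2 else []
    return dest, {'type': 'volume', 'source': src, 'options': sorted(opts)}

Two specifications written with different syntaxes then compare equal whenever their canonical forms match.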
@rubiksdot
Contributor Author

I tried to test this patch set before submitting it by doing:

$ TEST2RUN="podman_container_idempotency" bash ./ci/run_containers_tests.sh

but it kept bombing out with

... line 1223, in is_different
  File "/tmp/ansible_containers.podman.podman_container_payload_ga1uklio/ansible_containers.podman.podman_container_payload.zip/ansible_collections/containers/podman/plugins/module_utils/podman/podman_container_lib.py", line 887, in diffparam_log_level
KeyError: 'exitcommand'

So if there's another error like before my apologies ahead of time.

Fixes:

ERROR: Found 7 pylint issue(s) which need to be resolved:
ERROR: plugins/module_utils/podman/podman_container_lib.py:1288:37: ansible-format-automatic-specification: Format string contains automatic field numbering specification
...
ERROR: plugins/module_utils/podman/podman_container_lib.py:1400:27: ansible-format-automatic-specification: Format string contains automatic field numbering specification

also:

File "/tmp/ansible_containers.podman.podman_container_payload_6atuqv63/ansible_containers.podman.podman_container_payload.zip/ansible_collections/containers/podman/plugins/module_utils/podman/podman_container_lib.py", line 1405, in diffparam_volumes_all
KeyError: 'config'

Signed-off-by: Andrew <rubiksdot@grue.cc>
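The pylint rule mentioned here rejects automatic field numbering in str.format(); the fix is mechanical (illustrative values below, not the module's actual messages):

before, after = '/opt:/somedir', '/opt/://somedir/'

# Flagged by ansible-format-automatic-specification: automatic numbering
msg = 'volume changed from {} to {}'.format(before, after)

# Accepted: explicit field numbering
msg = 'volume changed from {0} to {1}'.format(before, after)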
@rubiksdot
Contributor Author

Gah. The format errors happened to come from the one sanity test that was skipped because pylint was not available/applicable for Python 3.9 (which is what I have). Figures. :/

@rubiksdot
Contributor Author

Regarding the remaining failure: I can't see why the failing assertion is correct. Reasoning below:

Given the following configs for the various tests:

VOLUME ["/data", "/data2"] +
test1:
test2:
test3:
test4:  /opt:/somedir/
test5:  /opt/://somedir/
test6:
test7:  /opt:/somedir
        /data
test8:  /data
test9:
test10: /opt:/anotherdir
        local_volume1:/data
test11: /opt//:/anotherdir
        local_volume1:/data/
test12: /opt:/anotherdir
        local_volume2:/data
test13: /opt:/anotherdir

I believe the mounted volumes would be:

test1:  /data
        /data2
test2:  /data
        /data2
test3:  /data
        /data2
test4:  /data
        /data2
        /opt:/somedir
test5:  /data
        /data2
        /opt:/somedir
test6:  /data
        /data2
test7:  /data
        /data2
        /opt:/somedir
test8:  /data
        /data2
test9:  /data
        /data2
test10: /opt:/anotherdir
        local_volume1:/data
        /data2
test11: /opt:/anotherdir
        local_volume1:/data
        /data2
test12: /opt:/anotherdir
        local_volume2:/data
        /data2
test13: /opt:/anotherdir
        /data
        /data2

So the move from test12 to test13 would be a change of /data from a named volume to an anonymous volume, no?

@sshnaidm
Member

@rubiksdot sorry for the late reply. Yes, you're correct, it's done intentionally - I don't currently support idempotency for local volumes because there is no way to determine from the container info whether a volume is anonymous or not.
This is an anonymous mount:

{
    "Destination": "/data2",
    "Driver": "local",
    "Mode": "",
    "Name": "e24db8ad6c51564c899637cd37aa9bf0ec5fac97181fd843fc2ae695114fc9f2",
    "Options": [
        "nodev",
        "exec",
        "nosuid",
        "rbind"
    ],
    "Propagation": "rprivate",
    "RW": true,
    "Source": "/home/sshnaidm/.local/share/containers/storage/volume/e24db8ad6c51564c899637cd37aa9bf0ec5fac97181fd843fc2ae695114fc9f2/_data",
    "Type": "volume"
},

and this is a created named volume:

{
    "Destination": "/data",
    "Driver": "local",
    "Mode": "",
    "Name": "local_volume2",
    "Options": [
        "nosuid",
        "nodev",
        "rbind"
    ],
    "Propagation": "rprivate",
    "RW": true,
    "Source": "/home/sshnaidm/.local/share/containers/storage/volumes/local_volume2/_data",
    "Type": "volume"
},

If you create a volume and name it e24db8ad6c51564c899637cd37aa9bf0ec5fac97181fd843fc2ae695114fc9f2 it will look the same. So while it's not clear how to determine this, I leave it as not idempotent; if you want to remount, you should recreate the container manually.

@rubiksdot
Contributor Author

@rubiksdot sorry for the late reply. Yes, you're correct, it's done intentionally - I don't currently support idempotency for local volumes because there is no way to determine from the container info whether a volume is anonymous or not. This is an anonymous mount:

...

If you create a volume and name it e24db8ad6c51564c899637cd37aa9bf0ec5fac97181fd843fc2ae695114fc9f2 it will look the same. So while it's not clear how to determine this, I leave it as not idempotent; if you want to remount, you should recreate the container manually.

I spotted this also and found that the past and present arguments are available (and hopefully reliable). So I rewrote my code to use the lack of a source to decide whether an anonymous (vs named) volume is being requested. In my tests this has worked nicely, so if you don't see an issue with the methodology I used, this test can be flipped (or removed).
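The idea described above, sketched with hypothetical names (the real logic lives in the module's diff code):

def requested_anonymous_volumes(volume_args):
    # A --volume argument with no source part (just a destination such as
    # "/data") requests an anonymous volume; "name:/data" and "/host:/data"
    # do not.
    return [v for v in volume_args if len(v.split(':')) == 1]

requested_anonymous_volumes(['/data', 'local_volume1:/data2', '/opt:/somedir'])
# ['/data']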

@sshnaidm
Member

@rubiksdot if you fix it, then of course the test can be fixed as well. Let me review your change; it will take a bit. Thanks!

@rubiksdot
Contributor Author

Hey. Should I look into that or is that something from your end?

(and did that branch merge go in the right direction? :)

@sshnaidm
Member

@rubiksdot Sorry for taking so long, this patch is not a small one and required more attention 🙂 ; I wasn't able to take care of it lately.
Because we have a lot of options here, I think it's worth testing it thoroughly with unittests rather than integration tests. I prepared a framework to run tests and a few example tests in https://github.com/containers/ansible-podman-collections/pull/556/files#diff-d5b04a15ce9ac85f16c4564b6ed5c451d760c417a1ec7426d2531eaa98b21ec8
Please let me know what you think.
We can either copy these tests into the current patch and continue to add them here, or merge it as is and continue adding them in the next patch. I'd like to help with adding tests as well.

@rubiksdot
Contributor Author

No worries at all. It is a big patch and it affects a lot. I manually tested it until I got tired of it, so I'm not surprised it's taking a while to review. :)

ATM I'm rather bogged down with RL, but I hope that will clear in a week or two, so I won't be able to touch this until then.

I'd love to do automated testing, and I tried doing so before I sent the last patch in so as not to keep messing up this PR, but I failed to run the tests on my local box. From VAGUE memory it was pulling in a copy of APC from github and testing that rather than testing my unpublished changes, but I wouldn't stake my life on this memory.

So if I can get help with getting that going when the time comes, I will be happy to allocate time to getting automated tests going for it (and getting help with writing them will also be cool :).

@oddomatik

Any updates on this? Thanks for any work here, this issue seems to basically be a show-stopper for ansible-podman integration in our environment.

@rubiksdot
Contributor Author

If I can get some help getting podman's testing environment working with my own code I'll be happy to:

a. update the patch-set to eliminate bitrot
b. add extra tests to deal with the new code.

@Macleykun

@sshnaidm would you be able to help rubiksdot with his question?

@sshnaidm
Member

@rubiksdot @oddomatik @Macleykun and others, I have started to create unittests for this method in #556. The usual way to run the unittests is with pytest.
To run integration tests I just include the tasks in a playbook as:

include_tasks: /path/to/ansible-podman-collections/tests/integration/targets/podman_volume_info/tasks/main.yml

The PR also includes some fixes for the original code. You can take it as a base and add more unittests or integration tests. I think with this kind of change we'd better focus on unittests for this particular function, since it's hard to test all variations with Ansible tasks and I'm not sure all mount types will be possible to cover.
Please add more tests and let's make sure it doesn't break idempotency, because it's an awesome patch, but it's huge and requires testing.
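A minimal example of the kind of pytest unittest this suggests (illustrative names only; the real tests live in #556):

import re
import pytest

def normalize(spec):
    # Stand-in for the helper under test: squash repeated slashes and strip
    # the trailing one from each component of a volume spec.
    return ':'.join(re.sub(r'/+', '/', p).rstrip('/') or '/' for p in spec.split(':'))

@pytest.mark.parametrize('old,new,same', [
    ('/opt:/somedir', '/opt/://somedir/', True),
    ('/opt:/somedir', '/opt:/otherdir', False),
])
def test_volume_spec_comparison(old, new, same):
    assert (normalize(old) == normalize(new)) is same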

@sshnaidm
Member

@rubiksdot @oddomatik @Macleykun please see if #750 helped here.

@rubiksdot
Contributor Author

I'm going to see if I can find the time to unbitrot this (#750 looks like it'll make that fun but I'll see) and then see if I can figure out how to test things locally and automatically so that I don't burn out testing and don't pollute the change further.

@sshnaidm
Member

sshnaidm commented Jun 2, 2024

I'm going to see if I can find the time to unbitrot this (#750 looks like it'll make that fun but I'll see) and then see if I can figure out how to test things locally and automatically so that I don't burn out testing and don't pollute the change further.

I just run the task from the testing role:

- hosts: all
  tasks:
    - include_tasks: /path/to/tests/integration/targets/podman_volume/tasks/main.yml

@Macleykun

I'm going to see if I can find the time to unbitrot this (#750 looks like it'll make that fun but I'll see) and then see if I can figure out how to test things locally and automatically so that I don't burn out testing and don't pollute the change further.

I just run the task from the testing role:

- hosts: all
  tasks:
    - include_tasks: /path/to/tests/integration/targets/podman_volume/tasks/main.yml

I personally ran this playbook on a test machine, and everything checks out!
For a few days I was looking into where and why I had the same issue with the volume from here, but I don't seem to be able to reproduce that same issue.
