
lxc-top does not work with openrc and cgroup2 (Alpine Linux) #4376

Open
3 tasks done
ncopa opened this issue Dec 22, 2023 · 30 comments · May be fixed by #4439
Labels
Bug Confirmed to be a bug Easy Good for new contributors

Comments

@ncopa
Contributor

ncopa commented Dec 22, 2023


Required information

  • Distribution: Alpine Linux
  • Distribution version: 3.19.0
  • The output of
    • lxc-start --version: 5.0.3
    • lxc-checkconfig:
LXC version 5.0.3                                                       
                                                                        
--- Namespaces ---                                                      
Namespaces: enabled                                                     
Utsname namespace: enabled                                              
Ipc namespace: enabled                                                  
Pid namespace: enabled                                                  
User namespace: enabled                                                 
newuidmap is not installed                                              
newgidmap is not installed                                              
Network namespace: enabled

--- Control groups ---                                                  
Cgroups: enabled                                                        
Cgroup namespace: enabled                                               
Cgroup v1 mount points:                                                 
Cgroup v2 mount points:                                                 
 - /sys/fs/cgroup                                                       
Cgroup device: enabled                                                  
Cgroup sched: enabled                                                   
Cgroup cpu account: enabled                                             
Cgroup memory controller: enabled                                       
Cgroup cpuset: enabled                                                  
                                                                        
--- Misc ---                                                            
Veth pair device: enabled, loaded                                       
Macvlan: enabled, not loaded                                            
Vlan: enabled, not loaded
Bridges: enabled, loaded
Advanced netfilter: enabled, loaded
CONFIG_IP_NF_TARGET_MASQUERADE: enabled, not loaded
CONFIG_IP6_NF_TARGET_MASQUERADE: enabled, not loaded
CONFIG_NETFILTER_XT_TARGET_CHECKSUM: enabled, loaded
CONFIG_NETFILTER_XT_MATCH_COMMENT: enabled, not loaded
FUSE (for use with lxcfs): enabled, not loaded

--- Checkpoint/Restore ---
checkpoint restore: enabled
CONFIG_FHANDLE: enabled
CONFIG_EVENTFD: enabled
CONFIG_EPOLL: enabled
CONFIG_UNIX_DIAG: enabled
CONFIG_INET_DIAG: enabled
CONFIG_PACKET_DIAG: enabled
CONFIG_NETLINK_DIAG: enabled
File capabilities: enabled

Note : Before booting a new kernel, you can check its configuration
usage : CONFIG=/path/to/config /usr/bin/lxc-checkconfig
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
devtmpfs /dev devtmpfs rw,nosuid,noexec,relatime,size=10240k,nr_inodes=252290,mode=755,inode64 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
shm /dev/shm tmpfs rw,nosuid,nodev,noexec,relatime,inode64 0 0
/dev/sr0 /media/cdrom iso9660 ro,relatime,nojoliet,check=s,map=n,blocksize=2048,iocharset=utf8 0 0
tmpfs / tmpfs rw,relatime,mode=755,inode64 0 0
tmpfs /run tmpfs rw,nosuid,nodev,size=405928k,nr_inodes=819200,mode=755,inode64 0 0
mqueue /dev/mqueue mqueue rw,nosuid,nodev,noexec,relatime 0 0
/dev/loop0 /.modloop squashfs ro,relatime,errors=continue 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
debugfs /sys/kernel/debug debugfs rw,nosuid,nodev,noexec,relatime 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
tracefs /sys/kernel/debug/tracing tracefs rw,nosuid,nodev,noexec,relatime 0 0
none /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate 0 0

Issue description

lxc-top stopped working after OpenRC switched to the unified cgroup hierarchy (cgroup2), which is the default in Alpine 3.19.0.

This is the output of lxc-top -b:

time_ms,container,cpu_nanos,cpu_sys_userhz,cpu_user_userhz,blkio_bytes,blkio_iops,mem_used_bytes,memsw_used_bytes,kernel_mem_used_bytes
Unable to read cgroup item memory.usage_in_bytes
Unable to read cgroup item memory.limit_in_bytes
Unable to read cgroup item memory.memsw.usage_in_bytes
Unable to read cgroup item memory.memsw.limit_in_bytes
Unable to read cgroup item memory.kmem.usage_in_bytes
Unable to read cgroup item memory.kmem.limit_in_bytes
Unable to read cgroup item cpuacct.usage
Unable to read cgroup item cpuacct.stat
Unable to read cgroup item cpuacct.stat
Unable to read cgroup item blkio.throttle.io_service_bytes
Unable to read cgroup item blkio.throttle.io_serviced
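
For context, every item in the list above is a legacy (cgroup1) file name. A rough, unofficial sketch of their closest cgroup2 counterparts (the right-hand names are real cgroup2 files; `None` marks items with no single-file equivalent):

```python
# Hypothetical mapping from the cgroup1 item names lxc-top reads to their
# closest cgroup2 counterparts. None means there is no direct equivalent.
CGROUP1_TO_CGROUP2 = {
    "memory.usage_in_bytes": "memory.current",
    "memory.limit_in_bytes": "memory.max",
    "memory.memsw.usage_in_bytes": None,  # cgroup2 splits swap out into
    "memory.memsw.limit_in_bytes": None,  # memory.swap.*, not mem+swap combined
    "memory.kmem.usage_in_bytes": None,   # kernel memory is not separately
    "memory.kmem.limit_in_bytes": None,   # accounted under cgroup2
    "cpuacct.usage": "cpu.stat",          # usage_usec field (microseconds)
    "cpuacct.stat": "cpu.stat",           # user_usec / system_usec fields
    "blkio.throttle.io_service_bytes": "io.stat",  # rbytes / wbytes fields
    "blkio.throttle.io_serviced": "io.stat",       # rios / wios fields
}
```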

Downstream bug report: https://gitlab.alpinelinux.org/alpine/aports/-/issues/15607

Steps to reproduce

  1. Start alpine in qemu: qemu-system-x86_64 -m 2048 -accel kvm -cdrom https://dl-cdn.alpinelinux.org/alpine/v3.19/releases/x86_64/alpine-virt-3.19.0-x86_64.iso -nographic
  2. login as root and execute setup-alpine -q to set up basic networking and apk repositories.
  3. install packages: apk add iptables lxc lxc-bridge lxc-templates-legacy-alpine
  4. start base services: service cgroups start && service dnsmasq.lxcbr0 start
  5. create a simple lxc container: lxc-create -t alpine -n a1
  6. start the container: lxc-start -n a1
  7. run lxc-top -b

Information to attach

  • any relevant kernel output (dmesg)
[  136.939216] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need t.
[  190.764549] lxc-create[2436]: memfd_create() called without MFD_EXEC or MFD_NOEXEC_SEAL set
[  226.589710] lxcbr0: port 1(vethXOfaJf) entered blocking state
[  226.589712] lxcbr0: port 1(vethXOfaJf) entered disabled state
  • container log (The file from running lxc-start -n <c> -l TRACE -o <logfile> ) a1.trace.log
  • the containers configuration file a1.config.txt
@ncopa
Contributor Author

ncopa commented Dec 22, 2023

alpine:~# ls /sys/fs/cgroup/lxc.payload.a1/
cgroup.controllers        cpuset.cpus.partition     memory.max
cgroup.events             cpuset.mems               memory.min
cgroup.freeze             cpuset.mems.effective     memory.oom.group
cgroup.kill               crond                     memory.peak
cgroup.max.depth          hugetlb.2MB.current       memory.reclaim
cgroup.max.descendants    hugetlb.2MB.events        memory.stat
cgroup.procs              hugetlb.2MB.events.local  memory.swap.current
cgroup.stat               hugetlb.2MB.max           memory.swap.events
cgroup.subtree_control    hugetlb.2MB.numa_stat     memory.swap.high
cgroup.threads            hugetlb.2MB.rsvd.current  memory.swap.max
cgroup.type               hugetlb.2MB.rsvd.max      memory.swap.peak
cpu.idle                  io.bfq.weight             memory.zswap.current
cpu.max                   io.latency                memory.zswap.max
cpu.max.burst             io.max                    networking
cpu.stat                  io.stat                   pids.current
cpu.stat.local            memory.current            pids.events
cpu.weight                memory.events             pids.max
cpu.weight.nice           memory.events.local       pids.peak
cpuset.cpus               memory.high               syslog
cpuset.cpus.effective     memory.low
alpine:~# ls /sys/fs/cgroup/lxc.monitor.a1/
cgroup.controllers        cpuset.cpus.effective     memory.low
cgroup.events             cpuset.cpus.partition     memory.max
cgroup.freeze             cpuset.mems               memory.min
cgroup.kill               cpuset.mems.effective     memory.oom.group
cgroup.max.depth          hugetlb.2MB.current       memory.peak
cgroup.max.descendants    hugetlb.2MB.events        memory.reclaim
cgroup.procs              hugetlb.2MB.events.local  memory.stat
cgroup.stat               hugetlb.2MB.max           memory.swap.current
cgroup.subtree_control    hugetlb.2MB.numa_stat     memory.swap.events
cgroup.threads            hugetlb.2MB.rsvd.current  memory.swap.high
cgroup.type               hugetlb.2MB.rsvd.max      memory.swap.max
cpu.idle                  io.bfq.weight             memory.swap.peak
cpu.max                   io.latency                memory.zswap.current
cpu.max.burst             io.max                    memory.zswap.max
cpu.stat                  io.stat                   pids.current
cpu.stat.local            memory.current            pids.events
cpu.weight                memory.events             pids.max
cpu.weight.nice           memory.events.local       pids.peak
cpuset.cpus               memory.high

@stgraber
Member

This will most likely need a change similar to that in #4373

@stgraber stgraber added Bug Confirmed to be a bug Easy Good for new contributors labels Dec 22, 2023
@anooprac

anooprac commented Apr 1, 2024

Hi! We are UT Austin students doing open-source contributions for a project in a class. Would it be fine to have this project assigned to us? Thank you!

@hallyn
Member

hallyn commented Apr 2, 2024

Anyone is welcome to post PRs or send patches - thanks :)

@DevonSchwartz

I am with anooprac. We believe we have a solution to this, but we do not know how to test it.

How do we use Jenkins and test lxc-top / lxc-create on our local machine after running meson build on the git repo?

@DevonSchwartz

[Screenshot from 2024-04-29 12-49-30]
This also happens with Ubuntu images.

@DevonSchwartz

[Screenshot from 2024-04-29 13-44-23]
This is the error we get running ./lxc-create from the main branch after building the repo.

@stgraber
Member

You may need to do sudo make install to have all the system files put in place.

@DevonSchwartz


[Screenshot from 2024-04-29 16-21-06]

We have the same issue with parsing the config file. We made sure to run sudo make install and sudo make all.

@DevonSchwartz

DevonSchwartz commented Apr 29, 2024

[Screenshot from 2024-04-29 17-03-09]
Apparently the capability passed through is "mac-admin". Is this correct?
When I pass in "none", that creates a valid container, but then lxc-start has issues.

@stgraber
Member

Sounds like you built LXC without libcap enabled.

@stgraber
Member

Make sure you have libcap-dev libseccomp-dev libcap-ng-dev libapparmor-dev for most features to be enabled.

@DevonSchwartz

DevonSchwartz commented Apr 29, 2024

Make sure you have libcap-dev libseccomp-dev libcap-ng-dev libapparmor-dev for most features to be enabled.

Now we have valid capabilities, but we still fail to parse the config file

[Screenshot from 2024-04-29 17-28-57]

Update: turns out I forgot to install the last two libraries. After installing them we are able to create a container. Thank you so much for your help!

@DevonSchwartz

We noticed that print and ERROR statements in do_lxcapi_get_cgroup_item() do not make it to the terminal. Is this expected behavior?

@DevonSchwartz

We were able to get printing to work in do_lxcapi_get_cgroup_item(), and we traced the issue to open_at() inside of lxc_read_try_buf_at() in src/lxc/file_utils.c. The path or file descriptor that's passed in could be invalid.

Is memory.usage_in_bytes a valid path?

@stgraber
Member

It's a valid path for cgroup1 but not for cgroup2.

Under cgroup2, the equivalent is memory.current (you can look those up in /sys/fs/cgroup/lxc.payload.NAME/)

@DevonSchwartz

It's a valid path for cgroup1 but not for cgroup2.

Under cgroup2, the equivalent is memory.current (you can look those up in /sys/fs/cgroup/lxc.payload.NAME/)

Does this also apply to memory.limit_in_bytes, memory.kmem.usage_in_bytes, and other paths? Should they ALL have the name memory.current.[x] instead?

@stgraber
Member

They all have cgroup2 equivalents.

We have some convenient logic to handle cgroup1 and cgroup2 in Incus that you can look at here https://github.com/lxc/incus/blob/main/internal/server/cgroup/abstraction.go to see both paths.

@DevonSchwartz

How do we know which cgroup version the system is using?

@DevonSchwartz DevonSchwartz linked a pull request May 1, 2024 that will close this issue
@DevonSchwartz

There are also some paths that are implemented for cgroup1 that do not have an equivalent for cgroup2. Is there another resource with the mapping from the cgroup1 paths to their cgroup2 equivalents?

@stgraber
Member

stgraber commented May 1, 2024

Hmm, within the core code we have pure_unified_system, which basically tells us if we are on cgroup2, but it's not 100% foolproof as systems can run a mix of both.

An option would be to just try the cgroup2 file first and, if that fails, fall back to the cgroup1 equivalent.

Which ones are you missing an equivalent for?
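
That "cgroup2 first, then cgroup1" fallback, together with a crude check for a pure cgroup2 host, can be sketched roughly like this (illustrative Python, not the actual lxc C code; the helper names and the cgroup.controllers heuristic are assumptions):

```python
import os

def unified_hierarchy(root="/sys/fs/cgroup"):
    # Rough heuristic: the unified (cgroup2) hierarchy exposes a
    # cgroup.controllers file at its root; cgroup1 hierarchies do not.
    return os.path.exists(os.path.join(root, "cgroup.controllers"))

def read_cgroup_item(cgroup_dir, v2_name, v1_name):
    # Try the cgroup2 file name first, then fall back to the cgroup1
    # name; return the file contents, or None if neither exists.
    for name in (v2_name, v1_name):
        try:
            with open(os.path.join(cgroup_dir, name)) as f:
                return f.read().strip()
        except OSError:
            continue
    return None
```

For example, `read_cgroup_item("/sys/fs/cgroup/lxc.payload.a1", "memory.current", "memory.usage_in_bytes")` would return the memory usage on either layout.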

@DevonSchwartz

We're missing the equivalent for
memory.usage_in_bytes
memory.limit_in_bytes
memory.kmem.usage_in_bytes
memory.kmem.limit_in_bytes
blkio.throttle.io_serviced

@stgraber
Member

stgraber commented May 2, 2024

memory.usage_in_bytes => memory.current
memory.limit_in_bytes => memory.max

The other three we'd need to skip as they don't have straight up equivalents.

@DevonSchwartz

We also don't have an equivalent for cpuacct.stat. Does it have one?

@DevonSchwartz

DevonSchwartz commented May 2, 2024

When we run containers locally we now get the following output:
lxc-start: u1: ../src/lxc/lxccontainer.c: wait_on_daemonized_start: 837 Received container state "ABORTING" instead of "RUNNING"
lxc-start: u1: ../src/lxc/tools/lxc_start.c: lxc_start_main: 307 The container failed to start
lxc-start: u1: ../src/lxc/tools/lxc_start.c: lxc_start_main: 310 To get more details, run the container in foreground mode
lxc-start: u1: ../src/lxc/tools/lxc_start.c: lxc_start_main: 312 Additional information can be obtained by setting the --logfile and --logpriority options

I don't think it is related to the lxc-top changes because I reverted to the main branch.
In the debug log there was a message: Failed to attach "vethNDAKnC" to bridge "lxcbr0", bridge interface doesn't exist

@stgraber
Member

stgraber commented May 2, 2024

sudo ip link add dev lxcbr0 type bridge should do the trick

@stgraber
Member

stgraber commented May 2, 2024

For cpuacct.usage, if you look at the file I referred to earlier, you'll see that you can use cpu.stat.

https://github.com/lxc/incus/blob/main/internal/server/cgroup/abstraction.go#L259

@DevonSchwartz

DevonSchwartz commented May 2, 2024

For cpuacct.usage, if you look at the file I referred to earlier, you'll see that you can use cpu.stat.

https://github.com/lxc/incus/blob/main/internal/server/cgroup/abstraction.go#L259

Does that also work for cgroup path cpuacct.stat?

@stgraber
Member

stgraber commented May 3, 2024

Oh, oops, Incus doesn't use the stat one. Though I'd suspect cpu.stat should be the equivalent for that then.
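
For illustration, deriving the old cpuacct numbers from cgroup2's flat-keyed cpu.stat could look roughly like this (the key names are real cgroup2 fields; the USER_HZ=100 conversion is the common default, and sysconf(_SC_CLK_TCK) would be the authoritative value):

```python
USER_HZ = 100  # assumption: typical kernel tick rate for cpuacct.stat units

def parse_cpu_stat(text):
    # cpu.stat is a flat-keyed file: one "key value" pair per line.
    return {k: int(v) for k, v in (line.split() for line in text.splitlines() if line)}

def cpuacct_usage_ns(cpu_stat):
    # cgroup1 cpuacct.usage was nanoseconds; cpu.stat's usage_usec is microseconds.
    return cpu_stat["usage_usec"] * 1000

def cpuacct_stat_ticks(cpu_stat):
    # cgroup1 cpuacct.stat reported user/system time in USER_HZ ticks.
    us_per_tick = 1_000_000 // USER_HZ
    return {
        "user": cpu_stat["user_usec"] // us_per_tick,
        "system": cpu_stat["system_usec"] // us_per_tick,
    }
```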

@DevonSchwartz

Sounds good. I'll add a case for the cgroup2 equivalent for cpuacct.stat.

DevonSchwartz pushed a commit to DevonSchwartz/lxc_virt_project that referenced this issue May 3, 2024
Closes lxc#4376

Signed-off-by: Devon Schwartz <devon.s.schwartz@utexas.edu>