
It seems impossible to define default cgroups settings for tasks #4085

Open
mramato opened this issue Feb 1, 2024 · 1 comment

mramato commented Feb 1, 2024

Summary

It seems impossible to define default cgroups settings for tasks.

Description

  • We are using ECS via AWS Batch to launch multiple jobs with heavy I/O.
  • The heavy I/O is causing noisy-neighbor issues that we would like to limit. Specifically, Batch (and the ecs-agent) fail to spin up new Docker containers because they time out due to delays caused by the heavy I/O, and this is on a Nitro SSD.
  • Since there is no way to limit file I/O via AWS Batch, we wanted to configure cgroups (v1) as part of our launch configuration to accomplish this.
  • The partial launch configuration is:
    # Get the major/minor device numbers of the RAID drive and set the limits to 50% so one job won't block other jobs from starting
    MAJOR=`stat -c %t /dev/md0`
    MINOR=`stat -c %T /dev/md0`
    
    printf "$MAJOR:$MINOR  4000000000" | sudo tee /sys/fs/cgroup/blkio/blkio.throttle.read_bps_device > /dev/null
    printf "$MAJOR:$MINOR  500000" | sudo tee /sys/fs/cgroup/blkio/blkio.throttle.read_iops_device > /dev/null
    printf "$MAJOR:$MINOR  2800000000" | sudo tee /sys/fs/cgroup/blkio/blkio.throttle.write_bps_device > /dev/null
    printf "$MAJOR:$MINOR  400000" | sudo tee /sys/fs/cgroup/blkio/blkio.throttle.write_iops_device > /dev/null
    
    # Create the ecs parent cgroup (if it does not already exist) and apply the same limits there
    sudo mkdir -p /sys/fs/cgroup/blkio/ecs
    printf "$MAJOR:$MINOR  4000000000" | sudo tee /sys/fs/cgroup/blkio/ecs/blkio.throttle.read_bps_device > /dev/null
    printf "$MAJOR:$MINOR  500000" | sudo tee /sys/fs/cgroup/blkio/ecs/blkio.throttle.read_iops_device > /dev/null
    printf "$MAJOR:$MINOR  2800000000" | sudo tee /sys/fs/cgroup/blkio/ecs/blkio.throttle.write_bps_device > /dev/null
    printf "$MAJOR:$MINOR  400000" | sudo tee /sys/fs/cgroup/blkio/ecs/blkio.throttle.write_iops_device > /dev/null
    
    sudo systemctl daemon-reload
    sudo systemctl restart docker
    

Expected Behavior

The launch configuration sets these values as expected, and since the ecs-agent starts tasks under the ecs cgroup, the expectation was that all tasks would inherit these settings from the parent.
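
For reference, a quick sanity check of the parent cgroup (a sketch, assuming the cgroup v1 layout written by the launch configuration above):

    # Assumed check: the ecs parent cgroup should hold the limits written at boot
    cat /sys/fs/cgroup/blkio/ecs/blkio.throttle.read_bps_device
    cat /sys/fs/cgroup/blkio/ecs/blkio.throttle.write_bps_device
    # List whatever task/container cgroups the agent has created underneath it
    ls /sys/fs/cgroup/blkio/ecs/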

Observed Behavior

All tasks start with empty cgroups and do not inherit from the ecs parent.

If I start a task and then manually adjust its cgroup from the host, the limits are applied correctly (sketch below). But since we don't know the task ID / container ID ahead of time, this is not a viable solution for us.
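
For illustration, a rough sketch of that manual adjustment, assuming the container's cgroup ends up under /sys/fs/cgroup/blkio/docker/<container-id> (the exact parent path depends on how the daemon/agent are configured, so treat it as an assumption):

    # Hypothetical manual fix, applied only after the task is already running;
    # the container ID has to be looked up by hand, which is why this does not scale.
    CONTAINER_ID=$(sudo docker ps -q | head -n 1)
    MAJOR=$(stat -c %t /dev/md0)
    MINOR=$(stat -c %T /dev/md0)
    printf "$MAJOR:$MINOR  4000000000" | \
      sudo tee /sys/fs/cgroup/blkio/docker/$CONTAINER_ID/blkio.throttle.read_bps_device > /dev/null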

Finally, there seems to be almost no documentation available for any of this, even though limiting I/O so that multiple jobs can safely run on a single node seems like a pretty standard use case.

Environment Details

sh-4.2$ sudo docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc., v0.0.0+unknown)

Server:
 Containers: 2
  Running: 2
  Paused: 0
  Stopped: 0
 Images: 5
 Server Version: 20.10.25
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 1e1ea6e986c6c86565bc33d52e34b81b3e2bc71f
 runc version: 4bccb38cc9cf198d52bebf2b3a90cd14e7af8c06
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.14.336-253.554.amzn2.x86_64
 Operating System: Amazon Linux 2
 OSType: linux
 Architecture: x86_64
 CPUs: 32
 Total Memory: 247.9GiB
 Name: ip-10-30-13-103.ec2.internal
 ID: Z4SU:C7UF:5AZX:UWAD:K4JH:A47W:4OIW:ATJQ:UF2W:43NX:BFBU:ZUOS
 Docker Root Dir: /data
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
sh-4.2$ df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        124G     0  124G   0% /dev
tmpfs           124G     0  124G   0% /dev/shm
tmpfs           124G  428K  124G   1% /run
tmpfs           124G     0  124G   0% /sys/fs/cgroup
/dev/nvme0n1p1   30G  1.9G   29G   7% /
/dev/md0        6.9T  184G  6.7T   3% /data
sh-4.2$ curl http://localhost:51678/v1/metadata | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   344  100   344    0     0  91392      0 --:--:-- --:--:-- --:--:--  111k
{
  "Cluster": "cesiumion-tiling-lt-0056ce2591ed453c8-14_Batch_3c431f46-4a1d-3183-a3cb-dfe21fb7b0d4",
  "ContainerInstanceArn": "arn:aws:ecs:us-east-1:899618071680:container-instance/cesiumion-tiling-lt-0056ce2591ed453c8-14_Batch_3c431f46-4a1d-3183-a3cb-dfe21fb7b0d4/833cf211563e4881bdf3baa3f669189f",
  "Version": "Amazon ECS Agent - v1.80.0 (*61c8a8c5)"
}

Supporting Log Snippets

TBD

mramato commented Feb 2, 2024

  • We switched to cgroups v2 and Amazon Linux 2023; same problem.
  • We are testing a workaround where we use inotifywait to watch the cgroup hierarchy and set the limits once ECS spins up a task (a rough sketch is below). It seems to work, but it definitely feels super hacky.
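
A minimal sketch of that workaround (assumptions: cgroup v2, task cgroups appearing as new directories under /sys/fs/cgroup/ecs, inotify-tools installed, run as root; the device numbers and limits mirror the launch configuration above, with io.max standing in for the v1 blkio.throttle.* files):

    #!/bin/bash
    # Hypothetical workaround: watch the ECS parent cgroup for newly created task
    # cgroups and apply I/O limits to each one as soon as it appears.
    DEVICE="$(stat -c %t /dev/md0):$(stat -c %T /dev/md0)"   # major:minor of the RAID device
    ECS_CGROUP=/sys/fs/cgroup/ecs                            # assumed parent of ECS task cgroups

    inotifywait -m -e create --format '%w%f' "$ECS_CGROUP" |
    while read -r new_cgroup; do
        [ -d "$new_cgroup" ] || continue                     # only react to new directories
        # cgroup v2 equivalent of the v1 blkio.throttle.* limits above;
        # requires the io controller to be enabled in the parent's cgroup.subtree_control
        echo "$DEVICE rbps=4000000000 wbps=2800000000 riops=500000 wiops=400000" \
            > "$new_cgroup/io.max"
    done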
