need better tooling to observe cgroups memory containment #5962

Open
garlick opened this issue May 13, 2024 · 0 comments

Problem: checking whether a job is running with memory containment (e.g. to confirm that the system was configured properly for it) takes several manual steps, something like:

  1. Run a job
  2. Convert one of the job's hostnames to a rank with flux overlay lookup <hostname>
  3. List the flux transient units on that rank with sudo flux exec -r <rank> systemctl --user list-units --type=service
  4. Show the unit status with e.g. sudo flux exec -r <rank> systemctl --user status imp-shell-1-fpUt87KFAo.service
  5. Look for the Memory line in the output, e.g.
● imp-shell-1-fpUt87KFAo.service - User workload
     Loaded: loaded (/run/user/500/systemd/transient/imp-shell-1-fpUt87KFAo.service; transient)
  Transient: yes
     Active: active (running) since Mon 2024-05-13 12:02:44 PDT; 2min 20s ago
   Main PID: 733 (flux-imp)
      Tasks: 3 (limit: 1599)
     Memory: 5.9M (max: 1.4G)
        CPU: 203ms
     CGroup: /user.slice/user-500.slice/user@500.service/app.slice/imp-shell-1-fpUt87KFAo.service
             ├─733 
             ├─737 
             └─739 

If everything is set up, the Memory line should show the current usage and limit.
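
As a rough sketch of what tooling for that last check could look like, the shell function below reads systemctl status output on stdin and prints the current usage and the limit from the Memory line. The function name memory_containment is hypothetical, and it assumes the Memory line has the "Memory: <current> (max: <limit>)" shape shown above; other systemd versions may format it differently.

```shell
# Hypothetical helper (not part of flux): read `systemctl status` output on
# stdin and print "<current> <limit>" taken from the Memory line.
# Exits non-zero if there is no Memory line or the line shows no "max:" limit.
memory_containment() {
    awk '
        $1 == "Memory:" {
            found = 1
            current = $2
            limit = ""
            # Pull the value that follows "max: ", stopping at a space or ")".
            if (match($0, /max: [^ )]+/))
                limit = substr($0, RSTART + 5, RLENGTH - 5)
            if (limit == "")
                exit 1
            print current, limit
            exit 0
        }
        END { if (!found) exit 1 }
    '
}

# Possible use with the commands from the steps above (rank and unit are
# placeholders for the values found in steps 2 and 3):
#   sudo flux exec -r "$rank" systemctl --user status "$unit" | memory_containment
```

A non-zero exit status would suggest that the unit is running without a memory limit, or that the status output did not have the expected shape.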

Two other notes:

  • These commands must run as the flux user in order to authenticate to the flux user's systemd instance. flux-exec(1) always runs processes as the instance owner, but if you run the commands locally, you need to run them explicitly as the flux user.
  • In addition, the DBUS_SESSION_BUS_ADDRESS environment variable has to be set, e.g.
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/$(id -u flux)/bus

In 0.62.0 onward, flux-exec(1) sets that for you.
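
For the local case, the two notes above could be combined into a small wrapper along these lines. This is only a sketch: the function names flux_dbus_address and flux_systemctl are made up for illustration, and it assumes that sudo is allowed to run commands as a user named flux and that the bus socket lives at the path shown above.

```shell
# Hypothetical helper: print the session bus address for a numeric uid,
# following the path pattern from the note above.
flux_dbus_address() {
    printf 'unix:path=/run/user/%s/bus\n' "$1"
}

# Hypothetical wrapper: run `systemctl --user` on the local node as the flux
# user with DBUS_SESSION_BUS_ADDRESS set. Assumes a user named "flux" exists
# and that sudo may run commands as that user.
flux_systemctl() {
    sudo -u flux env \
        DBUS_SESSION_BUS_ADDRESS="$(flux_dbus_address "$(id -u flux)")" \
        systemctl --user "$@"
}

# Possible use on the node where the job is running:
#   flux_systemctl list-units --type=service
#   flux_systemctl status imp-shell-1-fpUt87KFAo.service
```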
