DNF and YUM operation improvements #1083

Open
gvalkov opened this issue Apr 10, 2024 · 0 comments

Is your feature request related to a problem? Please describe

The dnf.packages and yum.packages operations use the RpmPackageProvides fact internally to map binaries to package names. For example, dnf.packages(packages=["vim"]) first runs dnf repoquery --whatprovides vim and then executes dnf install -y vim-enhanced, since vim is provided by vim-enhanced on RHEL-derived distros.

This is inconsistent with other package manager operations and also somewhat surprising (at least I would never expect it).

This also slows things down considerably, since the RpmPackageProvides fact is checked for each package sequentially. For example, given the following script and all the listed packages already installed:

from pyinfra.operations import dnf

PACKAGES = [
    "sudo",
    "tmux",
    "htop",
    "rsync",
    "net-tools",
    "less",
    "nnn",
    "ncdu",
    "hostname",
    "unzip",
    "tree",
    "time",
    "lsof",
    "tar",
    "patch",
    "smartmontools",
    "sysbench",
    "sysstat",
    "msr-tools",
]
dnf.packages(packages=PACKAGES, update=False)

Currently:

$ time pyinfra --dry -v --ssh-user=root test deploy.py
--> Loading config...
--> Loading inventory...
--> Connecting to hosts...
    [test] Connected

--> Preparing operations...
--> Preparing Operations...
    Loading: deploy.py
    [test] Loaded fact rpm.RpmPackages
    [test] Loaded fact rpm.RpmPackageProvides (name=sudo)
    [test] noop: package sudo is installed (1.9.5p2-1.el8_9)
    [test] Loaded fact rpm.RpmPackageProvides (name=tmux)
    [test] noop: package tmux is installed (2.7-3.el8)
    [test] Loaded fact rpm.RpmPackageProvides (name=htop)
    [test] noop: package htop is installed (3.2.1-1.el8)
    [test] Loaded fact rpm.RpmPackageProvides (name=rsync)
    [test] noop: package rsync is installed (3.1.3-19.el8_7.1)
    [test] Loaded fact rpm.RpmPackageProvides (name=net-tools)
    [test] noop: package net-tools is installed (2.0-0.52.20160912git.el8)
    [test] Loaded fact rpm.RpmPackageProvides (name=less)
    [test] noop: package less is installed (530-2.el8_9)
    [test] Loaded fact rpm.RpmPackageProvides (name=nnn)
    [test] noop: package nnn is installed (4.6-1.el8)
    [test] Loaded fact rpm.RpmPackageProvides (name=ncdu)
    [test] noop: package ncdu is installed (1.19-1.el8)
    [test] Loaded fact rpm.RpmPackageProvides (name=hostname)
    [test] noop: package hostname is installed (3.20-6.el8)
    [test] Loaded fact rpm.RpmPackageProvides (name=unzip)
    [test] noop: package unzip is installed (6.0-46.el8)
    [test] Loaded fact rpm.RpmPackageProvides (name=tree)
    [test] noop: package tree is installed (1.7.0-15.el8)
    [test] Loaded fact rpm.RpmPackageProvides (name=time)
    [test] noop: package time is installed (1.9-3.el8)
    [test] Loaded fact rpm.RpmPackageProvides (name=lsof)
    [test] noop: package lsof is installed (4.93.2-1.el8)
    [test] Loaded fact rpm.RpmPackageProvides (name=tar)
    [test] noop: package tar is installed (1.30-9.el8)
    [test] Loaded fact rpm.RpmPackageProvides (name=patch)
    [test] noop: package patch is installed (2.7.6-11.el8)
    [test] Loaded fact rpm.RpmPackageProvides (name=smartmontools)
    [test] noop: package smartmontools is installed (7.1-2.el8)
    [test] Loaded fact rpm.RpmPackageProvides (name=sysbench)
    [test] noop: package sysbench is installed (1.0.20-5.el8)
    [test] Loaded fact rpm.RpmPackageProvides (name=sysstat)
    [test] noop: package sysstat is installed (11.7.3-11.el8)
    [test] Loaded fact rpm.RpmPackageProvides (name=msr-tools)
    [test] noop: package msr-tools is installed (1.3-17.el8)
    [test] Ready: deploy.py

--> Detected changes:
    Operation                                                                                                                                                                                                                       Change   Conditional Change   
    dnf.packages (packages=['sudo', 'tmux', 'htop', 'rsync', 'net-tools', 'less', 'nnn', 'ncdu', 'hostname', 'unzip', 'tree', 'time', 'lsof', 'tar', 'patch', 'smartmontools', 'sysbench', 'sysstat', 'msr-tools'], update=False)   -        -                    

--> Disconnecting from hosts...
pyinfra --dry -v --ssh-user=root test deploy.py  0.67s user 0.06s system 4% cpu 16.234 total

That is 16s just to determine that nothing needs to be done.

Without expand_package_fact=RpmPackageProvides, the dry run takes about 1.2s:

$ time pyinfra --dry -v test deploy.py
--> Loading config...
--> Loading inventory...
--> Connecting to hosts...
    [test] Connected
    [pyinfra.connectors.ssh] Command exit status: 0
--> Preparing operations...
--> Preparing Operations...
    Loading: deploy.py
    [test] Loaded fact rpm.RpmPackages
    [test] noop: package sudo is installed (1.9.5p2-1.el8_9)
    [test] noop: package tmux is installed (2.7-3.el8)
    [test] noop: package htop is installed (3.2.1-1.el8)
    [test] noop: package rsync is installed (3.1.3-19.el8_7.1)
    [test] noop: package net-tools is installed (2.0-0.52.20160912git.el8)
    [test] noop: package less is installed (530-2.el8_9)
    [test] noop: package nnn is installed (4.6-1.el8)
    [test] noop: package ncdu is installed (1.19-1.el8)
    [test] noop: package hostname is installed (3.20-6.el8)
    [test] noop: package unzip is installed (6.0-46.el8)
    [test] noop: package tree is installed (1.7.0-15.el8)
    [test] noop: package time is installed (1.9-3.el8)
    [test] noop: package lsof is installed (4.93.2-1.el8)
    [test] noop: package tar is installed (1.30-9.el8)
    [test] noop: package patch is installed (2.7.6-11.el8)
    [test] noop: package smartmontools is installed (7.1-2.el8)
    [test] noop: package sysbench is installed (1.0.20-5.el8)
    [test] noop: package sysstat is installed (11.7.3-11.el8)
    [test] noop: package msr-tools is installed (1.3-17.el8)
    [test] Ready: deploy.py

--> Detected changes:
    Operation                                                                                                                                                                                                                       Change   Conditional Change   
    dnf.packages (packages=['sudo', 'tmux', 'htop', 'rsync', 'net-tools', 'less', 'nnn', 'ncdu', 'hostname', 'unzip', 'tree', 'time', 'lsof', 'tar', 'patch', 'smartmontools', 'sysbench', 'sysstat', 'msr-tools'], update=False)   -        -                    

--> Disconnecting from hosts...
pyinfra --dry -v test deploy.py  0.49s user 0.04s system 45% cpu 1.164 total
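
To put the two timings side by side, here is a rough back-of-the-envelope calculation. It assumes the whole difference between the runs comes from the per-package RpmPackageProvides lookups, which is only an approximation (connection setup and network jitter are ignored). The variable names are just for illustration.

```python
# Wall-clock totals copied from the two `time` outputs above.
with_provides_s = 16.234
without_provides_s = 1.164
package_count = 19  # number of entries in PACKAGES

# Assumption: all of the extra time is spent in the per-package fact lookups.
extra_s = with_provides_s - without_provides_s
per_package_s = extra_s / package_count
speedup = with_provides_s / without_provides_s

print(f"extra time:  {extra_s:.2f}s")
print(f"per package: {per_package_s:.2f}s")
print(f"speedup:     {speedup:.1f}x")
```

Under that assumption, each lookup costs roughly 0.8s and skipping them makes the dry run about 14 times faster on this host.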

Describe the solution you'd like

Drop expand_package_fact from dnf.packages or at least don't make it the default. It's both slow and probably not what most users would expect. If it needs to stay, an optimization would be to run a single repoquery with all the package names.
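
A minimal sketch of the single-repoquery idea, to show the shape of the command rather than a finished implementation. build_batched_provides_command is a hypothetical helper, not an existing pyinfra function, and it assumes that dnf repoquery --whatprovides accepts a comma-separated list of capabilities.

```python
import shlex


def build_batched_provides_command(names):
    """Build one repoquery command covering every requested name.

    Hypothetical helper, not part of pyinfra. Assumes that
    `dnf repoquery --whatprovides` accepts comma-separated capabilities.
    """
    if not names:
        raise ValueError("at least one package name is required")
    # Join all names into a single argument and quote it for the shell.
    capabilities = ",".join(names)
    return "dnf repoquery --whatprovides {0}".format(shlex.quote(capabilities))


print(build_batched_provides_command(["vim", "tmux"]))
```

One open point with this approach: the combined output would presumably not say which requested name each returned package satisfies, so mapping results back to the original names would need extra handling.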

If you agree, I can provide a PR.
