Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tuning of individual kernel threads #628

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

adriaan42
Copy link
Contributor

@adriaan42 adriaan42 commented Apr 17, 2024

In combination with #596 and #580, this PR implements the third feature needed to dynamically tune all relevant aspects of a realtime application using a dedicated HW device (typically a NIC).

Two things might need some discussion:

  • In my implementation I kept the basic idea that one instace can cover a number of different groups of threads. That makes it easy to migrate from the current scheduler plugin, but means we still need _has_dynamic_options, which is marked as a hack in plugins/base.py. The alternative would be to have one plugin instance per "group", which would make the profiles much longer.
    group.ktimers=0:f:2:*:^\[ktimers
    would become something like
    [kthread_ktimers]
    type=kthread
    regex=^ktimers
    policy=fifo
    sched_prio=2
    affinity=*
  • I copied the approach of using perf to monitor for creation of new threads. That means that when running both the scheduler plugin and the kthread plugin, we'd have two threads doing the same thing. For my applications that's not a problem, because I no longer use the scheduler plugin at all:
    • scheduler handles three things: IRQ affinities, kernel threads, and userland threads
    • For IRQ affinities I can use the irq plugin
    • For kernel threads I can use kthread
    • For userland threads I use systemd and cgroupv2, and I don't want TuneD to touch them

Signed-off-by: Adriaan Schmidt <adriaan.schmidt@siemens.com>
self._kthread_scan(initial=True)

@classmethod
def _get_config_options(self):

Check notice

Code scanning / CodeQL

First parameter of a class method is not named 'cls'

Class methods or methods of a type deriving from type should have 'cls', rather than 'self', as their first parameter.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix. The base plugin should be also fixed, but this should be done in new PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done... one of these days I need to learn how to run those checks locally before I push... :)

# command definitions: entry point for device tuning
#
@command_custom("_instance", per_device=True)
def _instance_kthread(self, start, value, device, verify, ignore_missing):

Check notice

Code scanning / CodeQL

Explicit returns mixed with implicit (fall through) returns

Mixing implicit and explicit returns may indicate an error as implicit returns always return None.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added return None.

"""
def __init__(self, monitor_repository, storage_factory, hardware_inventory, device_matcher, device_matcher_udev, plugin_instance_factory, global_cfg, variables):
super(KthreadPlugin, self).__init__(monitor_repository, storage_factory, hardware_inventory, device_matcher, device_matcher_udev, plugin_instance_factory, global_cfg, variables)
self._has_dynamic_options = True

Check warning

Code scanning / CodeQL

Overwriting attribute in super-class or sub-class

Assignment overwrites attribute _has_dynamic_options, which was previously defined in superclass [Plugin](1).
@yarda
Copy link
Contributor

yarda commented May 23, 2024

but means we still need _has_dynamic_options, which is marked as a hack in plugins/base.py.

It's OK for me, in long-term it's a candidate for rewrite/refactor, but there are other plugins using it as well. We will probably keep the idea and if we change the implementation, this could be then updated in all affected plugins the same way.

@yarda
Copy link
Contributor

yarda commented May 23, 2024

Regarding the cgroups, there is support for cgroups v1 in the scheduler plugin and we would also like to add support for the v2 for completeness. It could be useful for somebody.

It's OK if you are not using some plugin. We even wanted to add global configuration option allowing selective disablement of specific plugins in the stock profiles.

Signed-off-by: Adriaan Schmidt <adriaan.schmidt@siemens.com>
@adriaan42
Copy link
Contributor Author

Regarding the cgroups, there is support for cgroups v1 in the scheduler plugin and we would also like to add support for the v2 for completeness. It could be useful for somebody.

I found the whole cgroup topic to be rather tricky, because in modern systems, SystemD is the "cgroup manager", and it owns (by convention) the cgroup tree. So any creation of new cgroups should happen via SystemD, and can then use Delegation to create further sub-groups.

I've had some success with:

  • set AllowedCPUs on all the default slices (system.slice, user.slice, init.scope) to restrict all "normal" processes. This to some extent replaces the isolcpus= kernel option.
  • Create an isolated.slice using SystemD, with access to the desired CPUs, and then use Slice=isolated in my service file (or systemd-run --slice=isolated when launching from a shell) to gain access to the isolated CPUs.

But simply having TuneD move processes around seems like it could have unwanted side-effects, and should be handled with care...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants