
Releases: distributed-system-analysis/pbench

v0.73.0 (agent only)

31 Aug 04:41
7bd2a1a

Release notes TBD.

Installation

The basic installation notes can be found here.

There are no installation changes since the v0.72.0 release, but the release notes for that release describe some important changes. In particular, two pbench repos are needed: the original pbench repo and the version-specific pbench-0.73 repo. The former contains some support RPMs that are needed by every release; the latter contains the RPM of the pbench-agent for the current release. The Ansible playbooks and roles take care of these details, so you are encouraged to use them for installation. Otherwise, please adjust your installation procedures accordingly. Note that you will have to set pbench_copr_repo to pbench-0.73 in your inventory file.

In the case where you cannot or do not want to use Ansible, the two repos can be enabled like this (but you will have to manually install the config file and ssh key file):

 dnf copr enable ndokos/pbench
 dnf copr enable ndokos/pbench-0.73

The RPM version you should get when installing is 0.73.0-1g7bd2a1a6a.
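
The version strings quoted in these notes appear to have the shape <version>-<sequence>g<abbreviated commit ID>. That shape is inferred from the examples here, not from a documented format, so the following Python sketch is only an illustration of how an installed version string might be compared against the expected one. The helper name split_agent_rpm_version and the regular expression are assumptions.

```python
import re
from typing import NamedTuple


class AgentRpmVersion(NamedTuple):
    version: str   # e.g. "0.73.0"
    sequence: int  # build sequence number preceding the "g"
    commit: str    # abbreviated commit ID following the "g"


# Assumed pattern, inferred from strings such as "0.73.0-1g7bd2a1a6a".
_PATTERN = re.compile(r"^(\d+\.\d+\.\d+)-(\d+)g([0-9a-f]+)$")


def split_agent_rpm_version(text: str) -> AgentRpmVersion:
    """Split a version string into its three assumed components."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    version, sequence, commit = match.groups()
    return AgentRpmVersion(version, int(sequence), commit)


parsed = split_agent_rpm_version("0.73.0-1g7bd2a1a6a")
print(parsed.version, parsed.sequence, parsed.commit)
```

For this release the commit component begins with the abbreviated commit ID shown for the tag (7bd2a1a).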

Changelog

The complete changelog for this release (including server and dashboard changes that are not described above) is as follows:

v0.72.2 (agent only)

24 Aug 18:47
366e532

This release fixes a problem that was introduced with the release of v5.0.0 of the python redis module. That broke the tool-meister. We "fix" the problem by version-locking the module to its pre-5.0.0 incarnation. A more comprehensive fix is not yet available, so the imminent v0.73 release will use the same version lock "fix".
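
The lock amounts to requiring a major version of the redis module below 5. These notes do not describe how the lock is implemented in the packaging, so the following Python sketch only illustrates the condition itself; redis_is_pre_5 and installed_redis_version are hypothetical helper names.

```python
from importlib import metadata
from typing import Optional


def redis_is_pre_5(version: str) -> bool:
    """Return True when a version string such as "4.6.0" has a major version below 5."""
    major = int(version.split(".")[0])
    return major < 5


def installed_redis_version() -> Optional[str]:
    """Return the installed version of the "redis" distribution, or None if absent."""
    try:
        return metadata.version("redis")
    except metadata.PackageNotFoundError:
        return None


found = installed_redis_version()
if found is None:
    print("redis module not installed")
else:
    print(found, "ok" if redis_is_pre_5(found) else "too new for this agent release")
```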

The RPM version you should get when installing is 0.72.2-1g366e5323e.

v0.72.1 (agent only)

12 Jun 21:01
7d727fe

This release is a point release for v0.72.0. It fixes one problem that was found just as we were releasing v0.72.0. The problem was that pbench-postprocess-tools was not handling labeled hosts correctly (labels may be attached to hosts during tool registration). See issue #3454 and PR #3456 for a fuller description.

Installation is as described in the v0.72.0 release notes. The version of the pbench-agent RPM for this release is 0.72.1-2gf8fb65d92.

For the rest of the changes, see the v0.72.0 release notes.

v0.72.0 (agent only)

06 Jun 08:13
4391fbc

This is a minor release of the Pbench agent. It consists mostly of bug fixes and deletions of deprecated components that were announced previously.

The most visible parts of the changes are summarized below. The full change log can be found at Changelog, but note that most of the 408 commits are not for the agent: the list includes server and dashboard changes which are already incorporated into the current production Pbench server and dashboard.

Installation

The COPR repo names have changed: the pbench-agent RPM is now found in the pbench-0.72 repo. The reason for this change was that COPR does not allow us to keep different versions of the RPM in the same repo: it deleted the older one as soon as a newer one was built. We needed that capability however, so we chose to go with separate repos for each release.

There are some RPMs that are shared between versions (e.g. pbench-sysstat). We maintain those in the original COPR pbench repo. The upshot is that the user now has to install two COPR repos (if doing it manually). The Ansible roles have been modified to do that, so if you are installing through Ansible, you don’t need to worry about that, except for adding the following line to the [servers:vars] section of the inventory file:

pbench_repo_name = pbench-0.72

The inventory file should look like this:

[servers]
<host1>
<host2>
...

[servers:vars]
pbench_repo_name = pbench-0.72

pbench_key_url =  <URL to directory containing the key>
pbench_config_url = <URL to directory containing the config file>
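
Because pbench_repo_name no longer has a default, a quick check of the inventory before running the playbook can catch an omission. The following Python sketch approximates that check with the standard configparser module. Ansible's INI inventory parser is not configparser, so this is only a rough stand-in; the helper name missing_agent_vars, the example hosts and URLs, and the choice of which variables to treat as required are all assumptions based on the example inventory above.

```python
import configparser

REQUIRED_VARS = ("pbench_repo_name", "pbench_key_url", "pbench_config_url")


def missing_agent_vars(inventory_text: str) -> list:
    """List the required variables absent from the [servers:vars] section."""
    # allow_no_value lets bare host lines under [servers] parse as keys without values.
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(inventory_text)
    if not parser.has_section("servers:vars"):
        return list(REQUIRED_VARS)
    section = parser["servers:vars"]
    return [name for name in REQUIRED_VARS if not section.get(name)]


# Hypothetical inventory that forgets to set pbench_repo_name.
example = """
[servers]
host1.example.com
host2.example.com

[servers:vars]
pbench_key_url = https://example.com/agent/keys
pbench_config_url = https://example.com/agent/config
"""

print(missing_agent_vars(example))
```

With the example inventory, the only variable reported as missing is pbench_repo_name.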

The version of the pbench-agent RPM for this release is 0.72.0-1g4391fbc01.

Ansible roles

New Ansible roles have been uploaded to Ansible Galaxy, so you will need to update your installation of those roles:

ansible-galaxy collection install pbench.agent -f

In addition, there is no default for pbench_repo_name any longer (see above for an explanation). You will need to set it in your inventory file like this:

...
[servers:vars]
pbench_repo_name = pbench-0.72

pbench_key_url =  <URL for ssh key>
pbench_config_url =  <URL for config file>

New user-visible utilities

The pbench-tools-kill script has been added. This is a new utility intended to provide complete cleanup if some Tool Meister component refuses to start. This is usually because there are old processes running and keeping network ports busy. The tool cleans up such errant processes so you can start afresh.

Bug fixes and enhancements

The Tool Meister subsystem has undergone a few fixes and some enhancements, primarily in logging and reporting of status; also, the state-signals work was integrated into the Tool Meister (thanks Mustafa Eyceoz!). pbench-linpack has had some fixes (primarily thanks to Lukas Doktor). In addition, pbench-specjbb, pbench-uperf and pbench-fio have had bug fixes.

The user-tool script was broken and that necessitated a few changes to the Tool Meister and also to pbench-postprocess-tools. Thanks to Keith Valin for finding the problem and to Michey Mehta for debugging it.

As usual, if you find problems, please open an issue on GitHub.

Support for latest RHEL and Fedora versions

v0.72.0 supports RHEL 8.8, RHEL 9.2 and Fedora 37 and 38, in addition to the previously supported RHEL versions. Fedora 36 is not supported any longer (primarily because COPR has dropped it).

Deprecation notices and deletions of previously deprecated items

A default tool set was implicitly used by pbench-register-tool-set. It was deprecated in v0.71.0 and remains deprecated in v0.72.0; it will be removed in the next release, and you will then need to choose a tool set explicitly when registering tools. The name for what used to be the default tool set is legacy. In addition, there are light, medium and heavy tool sets. Not supplying an argument for the tool set still produces only a warning, but it will become an error in the next release.

The pbench-generate-token command (see the "Futures" section below) is deprecated and will be deleted in the next release of the agent.

The following have been previously deprecated and have now been deleted: pbench-run-benchmark, pbench-cyclictest, pbench-dbench, pbench-iozone, pbench-migrate, pbench-netperf. In addition, two contributed bench-scripts, pbench-bzt and pbench-mpt, have been deleted.

The stockpile subproject has been removed, as well as the script pbench-avg-stddev which was unused.

Futures

The following describes some details about future directions. One component of that is containerization. The Pbench server on production is already running in a container. Here we describe the current, experimental version of the containerized agent. The second component is user authentication and ownership of datasets. That is work in progress and what we describe here is what is available in the v0.72.0 release of the Pbench agent and the current production Pbench server.

This section is meant as a foretaste of things to come. We expect most users to continue using RPMs for installing the agent and the pbench-move-results (or pbench-copy-results) utility to upload datasets, just as with previous versions of the Pbench agent.

If you'd like to kick the tires a bit, read on and feel free to experiment, although we recommend that you don't use the following for "real" data. Things should work but that's not guaranteed: if you do venture forth and encounter problems, we would really like to know about them. Thanks in advance!

Containerized Pbench agent

There is an experimental Pbench agent container, intended as a demonstration project, available in the contrib/containerized-pbench directory of the b0.72 branch (the branch that was used to cut this release). The directory contains a README file, a pbench command and a demo script, pbench-demo.

The demo script (which is to be thought of more as "executable documentation" than anything else at this point) uses the pbench command to execute a series of commands inside a container. The first time that the pbench command runs, it realizes that there is no container yet, so it downloads the pbench-agent-all-fedora-36:b0.72 image from quay.io and starts the container. It then executes the first command that it was given inside that container. Subsequent invocations of the pbench command execute their arguments inside that container, first registering tools, then listing the tools, then running a simple fio benchmark under pbench-user-benchmark and finally pushing the results to the configured Pbench server. Although this is a very simple set of commands, it indicates how things would go in a more complicated invocation.

There are a couple of significant caveats: this version of the demo script does NOT use pbench-move-results to send the results to the server (although it could be modified to do so). Instead it passes an authentication token to the pbench-results-move command (see below) to push the results. That token is generated by the pbench-generate-token script, which is invoked at the very beginning of the demo script: that script asks for a user ID and a password and then generates and stores that token in a file (the file is stored in a directory which is mapped into the container from the outside, so the token persists beyond the run of the demo script). That means you have to have a user ID and a password on the Pbench server before generating the token.

To create a user ID with a password, you have to visit the Pbench dashboard and click on Login in the upper right. That will pop up a login/sign up dialog through which you can create an account that will then allow you to generate a token (or log in to the dashboard and look around). N.B. All data sent to the AWS Pbench Satellite or pbench.app.intlab.redhat.com “pass-through” server is owned by a legacy user: it’s all visible, but can’t be modified.

The trouble is that this is a very temporary arrangement: we expect that very soon, you will be able to use Red Hat SSO for logging in and generating the token. The accounts created as above will go away, as will the pbench-generate-token script, which is already deprecated. Any datasets submitted through this mechanism will therefore be orphaned, hence the recommendation to use this to kick the tires, not for storing "real" results that you don't want to lose. We are NOT planning to migrate any such results.

New utilities to upload datasets as an authenticated user

The existing pbench-move-results command works in exactly the same way as before: it uses ssh/scp to copy the results to a pass-through server, but there is no notion of a user owning those results. Although we expect that to continue to be the main mode of operation for users of v0.72.0, we are moving towards a future where users will authenticate using SSO and that authenticated identity will become the owner of the datasets that are submitted by that user. The new commands pbench-results-move and pbench-results-push use an HTTP PUT to send results to a v1.0 Pbench Server which now provides a RESTful API to its services (pbench-results-push is used by the pass-through server to send the results to a Pbench server using a legacy user ID). The new commands will eventually supplant the existing `pbench-move-resu...


Pbench Server release updating v0.69.10

08 Mar 11:57
390f979

The v0.69.10-server release is a maintenance release for the Pbench Server that removes the use of the pbench-report-status CLI interface from all the "Background Tasks" (cron jobs) and adds JSON log records for reporting purposes (e.g. see the mmjsonparse module of rsyslog).

v0.71.0 (agent only)

01 Jul 18:31
8591073

This is a very significant "minor" release of the pbench-agent code base, primarily to deliver the new "Tool Meister" sub-system.

NOTE WELL:

  • The notion of a "default" tool set is being deprecated and will be removed in the upcoming Pbench Agent v1.0 release. To replace it, the Pbench Agent is introducing a few named tool sets. See "Default Tool Set is Deprecated; Named tool sets introduced" below.
  • All tools registered prior to installing v0.71 must be re-registered; tools registered locally, or remotely, on a host with v0.69 or earlier version of the pbench-agent will be ignored. See "Tool registration kept local to the host where registration happens" below.

This release also delivers:

  • Support for RHEL 9 & CentOS Stream 9
  • Support of Prometheus and PCP tool data collection
  • Independence of Pbench Agent "tool" Scripts
  • Removal of gratuitous manipulation of networking firewalls
  • Removal of gratuitous software installation, only checks for requirements
    • True for both tools and benchmark convenience script requirements
    • Change to check command versions instead of RPM versions for pbench-fio, pbench-linpack, and pbench-uperf
  • The pbench-linpack benchmark convenience script now provides result graphs, JSON data files, and supports execution on one or more local / remote hosts
  • Required use of --user with pbench-move-results/pbench-copy-results
  • Support for the new HTTP PUT method of posting tar balls
  • Removal of the dependency on the SCL (Software Collections Library)
  • Dropped support for the pbench-trafficgen benchmark convenience script
  • Deprecation announcements for unused benchmark convenience scripts:
    • pbench-run-benchmark, pbench-cyclictest, pbench-dbench, pbench-iozone, pbench-migrate, and pbench-netperf
  • Semi-Public CLI Additions, Changes, and Removals
  • Many, many, bug fixes and behavioral improvements

You can review the Full ChangeLog on GitHub (all 560+ commits, tags b0.69-bp to v0.71.0), or read a summary with relevant details below.

We did not bump the "major" release version number with these changes because we still don't consider all the necessary functionality in place for such a major version bump.

Note that work on the v0.71 release started in earnest with the v0.69.3-agent release (tagged as b0.69-bp). A number of bug fixes and behaviors from the v0.71 work have already been back-ported and delivered in the various v0.69.* releases since then. These release notes will highlight only the behavioral changes that have not been back-ported previously.

This release supports RHEL 7.9, RHEL 8.6, RHEL 9, CentOS-Stream 8, CentOS-Stream 9 and Fedora 35. For various reasons, it does NOT support earlier versions of RHEL 8 (ansible vs. ansible-core dependency problem), RHEL 9.1 (missing repos) or Fedora 36 (python-3.10 problems). If you need support for any of these, please talk to us: we will do our best to accommodate you in some way, but there is no guarantee.

Installation

There are no installation changes in this release: see the Getting Started Guide for how to install or update.

After installation or update, you should have version 0.71.0-3g85910732a of the pbench-agent RPM installed.

RPMs are available from Fedora COPR, covering Fedora 35 (x86_64 only), EPEL 7, 8, & 9 (x86_64 and aarch64), and CentOS Stream 8 & 9 (x86_64 and aarch64), but please note there are problems with some distros as described above.

There are Ansible playbooks available via Ansible Galaxy to install the pbench-agent, and the pieces needed (key and configuration files) to be able to send results to a Pbench Server. To use the RPMs provided above via COPR with the playbooks, your inventory file needs to include the fedoraproject_username variable set to ndokos, for example:

...

[servers:vars]
fedoraproject_username = ndokos
pbench_repo_name = pbench-test

...

Alternatively, one can specify fedoraproject_username on the command line, rather than having it specified in the inventory file:

ansible-playbook -i <inventory> <playbook> -e '{fedoraproject_username: ndokos}' -e '{pbench_repo_name: pbench-test}'

NOTE WELL: If the inventory file also has a definition for pbench_repo_url_prefix (which was standard practice before fedoraproject_username was introduced), it needs to be deleted, otherwise it will override the default repo URL and the fedoraproject_username change will not take effect.

While we don't include installation instructions for the new node-exporter and dcgm tools in the published documentation, you can find a manual installation procedure for the Prometheus "node_exporter" and references to the Nvidia "DCGM" documentation in the agent/tool-scripts/README.

Container images built using the above RPMs are available in the Pbench organization in the Quay.io container image repository using tags latest, v0.71.0, and 0b7f55850.

Summary of Changes

Default Tool Set is Deprecated; Named tool sets introduced

The notion of a "default" tool set is being deprecated and will be removed in the upcoming Pbench Agent v1.0 release. In preparation for this deprecation, we have added additional named tool sets for users to consider replacing the "default" tool set.

This deprecation announcement is to address the very heavy-weight tools employed by the "default" tool set, including pidstat, proc-interrupts, and perf (aka perf record).

The four named tool sets added are:

  • legacy: iostat, mpstat, perf, pidstat, proc-interrupts, proc-vmstat, sar, turbostat (the current "default" tool set)
  • light: vmstat
  • medium: ${light}, iostat, sar (this will be the new default tool set in Pbench Agent v1.0)
  • heavy: ${medium}, perf, pidstat, proc-interrupts, proc-vmstat, turbostat
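
The ${light} and ${medium} entries refer to the contents of another tool set. How the Pbench Agent resolves these references internally is not shown here; the following Python sketch simply mirrors the list above in a dictionary and expands the references recursively. The names TOOL_SETS and expand_tool_set are hypothetical.

```python
# Tool set definitions copied from the list above; "${name}" refers to another set.
TOOL_SETS = {
    "legacy": ["iostat", "mpstat", "perf", "pidstat", "proc-interrupts",
               "proc-vmstat", "sar", "turbostat"],
    "light": ["vmstat"],
    "medium": ["${light}", "iostat", "sar"],
    "heavy": ["${medium}", "perf", "pidstat", "proc-interrupts",
              "proc-vmstat", "turbostat"],
}


def expand_tool_set(name, tool_sets=TOOL_SETS):
    """Return the flat tool list for a named set, resolving ${other} references."""
    tools = []
    for entry in tool_sets[name]:
        if entry.startswith("${") and entry.endswith("}"):
            tools.extend(expand_tool_set(entry[2:-1], tool_sets))
        else:
            tools.append(entry)
    return tools


for set_name in TOOL_SETS:
    print(f"{set_name}: {', '.join(expand_tool_set(set_name))}")
```

Going by the list above, heavy expands to the legacy tools with vmstat added and mpstat omitted.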

Users are not required to use the pre-defined tool sets: a user may register whatever tools they like; or, a user may define a custom, named tool set in /opt/pbench-agent/config/pbench-agent.cfg (follow the pattern of the default tool set definitions in /opt/pbench-agent/config/pbench-agent-default.cfg -- note, we don't support modifications to the default configuration file).

In addition to the "default" tool set deprecation, the --toolset option is also deprecated and will be removed with the Pbench Agent v1.0 release. This is due to the fact that a tool set name will also be required going forward with the v1.0 release.

As a reminder, if you are using the "default" tool set, you need to ensure the pbench-sysstat, perf, and kernel-tools (which provides turbostat) RPMs are installed.

Support for RHEL 9 & CentOS Stream 9

Support for RHEL 9 & CentOS Stream 9 is provided in this release.

The New "Tool Meister" Sub-System

The "Tool Meister" sub-system (introduced by PR #1248) is the major piece of functionality delivered with the release of v0.71 of the pbench-agent.

This is a significant change, where the pbench-agent first orchestrates the instantiation of a "Tool Meister" process on all hosts registered with tools, using a Redis instance to coordinate their operation, and the new "Tool Data Sink" process handles the collection of data into the pbench run directory hierarchy. This effectively eliminates all remote SSH operations for individual tools except the initial one per host to create each Tool Meister instance.

One Tool Meister instance is created per registered host, and then a single Tool Data Sink instance is created on the host where the benchmark convenience script is run. The Tool Meister instances are responsible for running the registered tools on their respective host, collecting the data generated as appropriate. The Tool Data Sink is responsible for collecting and storing locally all data sent to it from the deployed Tool Meister instances.

User-Controlled Orchestration of "Tool Meister" Sub-System via Container Images

Container images are provided for the constituent components of the Tool Meister sub-system, the Tool Meister image and the Tool Data Sink image. The images allow for the orchestration of the Tool Meister sub-system to be handled by the user instead of automatically by the pbench-agent.

The "Tool Meister" Sub-System with No Tools

While this is not a new feature of the Pbench Agent, it is worth noting that when no tools are registered, the "Tool Meister" sub-system is not deployed and the bench scripts still execute normally.

Tool registration kept local to the host where registration happens

Along with the new "Tool Meister" sub-system comes a subtle, but significant, change to how tools are registered.

Prior to v0.71, tool registration for remote hosts was recorded locally, and also remotely via ssh.

With v0.71, tools are recorded only locally when they are registered and the validation of remote hosts is deferred until the workload is run. During its initialization, the Tool Meister sub-system now reports when registered tools are not present on registered hosts, and, if a tool is not installed, an error message will be displayed, and the "bench-script" will exit with a failure code.

The registered tools are recorded in a local directory off of the "pbench_run" directory, by default /var/lib/pbench-agent/tools-v1-<name>, where <name> is the name of the Tool Group under which the tools were registered.

...


v0.71.0-beta.0 (agent-only)

25 May 20:41
Pre-release

This is a very significant "minor" release of the pbench-agent code base, primarily to deliver the new "Tool Meister" sub-system.

It also delivers:

  • Support for RHEL 9 & CentOS Stream 9
  • Tool registration kept local to the host where registration happens
  • Support of Prometheus and PCP tool data collection
  • Independence of Pbench Agent "tool" Scripts
  • Reduction of the default tool set to iostat, sar, & vmstat tools
  • Removal of gratuitous manipulation of networking firewalls
  • Removal of gratuitous software installation, only checks for requirements
    • True for both tools and benchmark convenience script requirements
    • Change to check command versions instead of RPM versions for pbench-fio, pbench-linpack, and pbench-uperf
  • The pbench-linpack benchmark convenience script now provides result graphs, JSON data files, and supports execution on one or more local / remote hosts
  • Required use of --user with pbench-move-results/pbench-copy-results
  • Support for the new HTTP PUT method of posting tar balls
  • Removal of the dependency on the SCL (Software Collections Library)
  • Dropped support for the pbench-trafficgen benchmark convenience script
  • Deprecation announcements for unused benchmark convenience scripts:
    • pbench-run-benchmark, pbench-cyclictest, pbench-dbench, pbench-iozone, pbench-migrate, and pbench-netperf
  • Semi-Public CLI Additions, Changes, and Removals
  • Many, many, bug fixes and behavioral improvements

You can review the Full ChangeLog on GitHub (all 560+ commits, tags b0.69-bp to v0.71.0-beta.0), or read a summary with relevant details below.

We did not bump the "major" release version number with these changes because we still don't consider all the necessary functionality in place for such a major version bump.

Note that work on the v0.71 release started in earnest with the v0.69.3-agent release (tagged as b0.69-bp). A number of bug fixes and behaviors from the v0.71 work have already been back-ported and delivered in the various v0.69.* releases since then. These release notes will highlight only the behavioral changes that have not been back-ported previously.

Installation

There are no installation changes in this release: see the Getting Started Guide for how to install or update.

After installation or update, you should have version 0.71.0-XXgXXXXXXXXX of the pbench-agent RPM installed.

RPMs are available from Fedora COPR, covering Fedora 34, 35, 36, EPEL 7, 8, 9, and CentOS Stream 8 & 9.

There are Ansible playbooks available via Ansible Galaxy to install the pbench-agent, and the pieces needed (key and configuration files) to be able to send results to a Pbench Server. To use the RPMs provided above via COPR with the playbooks, your inventory file needs to include the fedoraproject_username variable set to ndokos, for example:

...

[servers:vars]
fedoraproject_username = ndokos
pbench_repo_name = pbench-test

...

Alternatively, one can specify fedoraproject_username on the command line, rather than having it specified in the inventory file:

ansible-playbook -i <inventory> <playbook> -e '{fedoraproject_username: ndokos}' -e '{pbench_repo_name: pbench-test}'

NOTE WELL: If the inventory file also has a definition for pbench_repo_url_prefix (which was standard practice before fedoraproject_username was introduced), it needs to be deleted, otherwise it will override the default repo URL and the fedoraproject_username change will not take effect.

While we don't include installation instructions for the new node-exporter and dcgm tools in the published documentation, you can find a manual installation procedure for the Prometheus "node_exporter" and references to the Nvidia "DCGM" documentation in the agent/tool-scripts/README.

Container images built using the above RPMs are available in the Pbench organization in the Quay.io container image repository using tags beta, v0.71.0-XX, and XXXXXXXXX.

Summary of Changes

Support for RHEL 9 & CentOS Stream 9

Support for RHEL 9 & CentOS Stream 9 is provided in this release.

The New "Tool Meister" Sub-System

The "Tool Meister" sub-system (introduced by PR #1248) is the major piece of functionality delivered with the release of v0.71 of the pbench-agent.

This is a significant change, where the pbench-agent first orchestrates the instantiation of a "Tool Meister" process on all hosts registered with tools, using a Redis instance to coordinate their operation, and the new "Tool Data Sink" process handles the collection of data into the pbench run directory hierarchy. This effectively eliminates all remote SSH operations for individual tools except the initial one per host to create each Tool Meister instance.

One Tool Meister instance is created per registered host, and then a single Tool Data Sink instance is created on the host where the benchmark convenience script is run. The Tool Meister instances are responsible for running the registered tools for that host, collecting the data generated as appropriate. The Tool Data Sink is responsible for collecting and storing locally all data sent to it from the deployed Tool Meister instances.

User Orchestration of "Tool Meister" Sub-System

Container images are provided for the constituent components of the Tool Meister sub-system, the Tool Meister image and the Tool Data Sink image. The images allow for the orchestration of the Tool Meister sub-system to be handled by the user instead of automatically by the pbench-agent.

The "Tool Meister" Sub-System with No Tools

While this is not a new feature of the Pbench Agent, it is worth noting that when no tools are registered, the "Tool Meister" sub-system is not deployed and the bench scripts still execute normally.

All Tool Registration Handled Locally

Along with the new "Tool Meister" sub-system comes a subtle, but significant, change to how tools are registered.

Prior to v0.71, tool registration for remote hosts was recorded locally, and also remotely via ssh.

With v0.71, tools are recorded only locally when they are registered and the validation of remote hosts is deferred until the workload is run. During its initialization, the Tool Meister sub-system now reports when registered tools are not present on registered hosts, and, if a tool is not installed, an error message will be displayed, and the "bench-script" will exit with a failure code.

The registered tools are recorded in a local directory off of the "pbench_run" directory, by default /var/lib/pbench-agent/tools-v1-<name>, where <name> is the name of the Tool Group under which the tools were registered.

The process of registering tools on local or remote hosts no longer validates that those tools are available during tool registration. The Tool Meister sub-system now reports when registered tools are not present on registered hosts before beginning a benchmark workload. An error message will be displayed, and the particular "bench-script" will exit with a failure code.

All tools registered prior to installing v0.71 must be re-registered; tools registered locally or remotely on a host with v0.69 or earlier of the pbench-agent will be ignored.

New Support for Prometheus and PCP-based Tools

The new "Tool Meister" sub-system enables support of Prometheus and PCP-based tools for data collection.

The existing tools supported prior to the v0.71 release can be categorized as "Transient" tools. By transient we mean that a given tool is started immediately before and stopped immediately after the execution of a benchmark workload. For example, when using pbench-fio -b 4,16,32 -t read,write, the transient tools are started immediately before each fio job is executed, and stopped immediately following its completion, for each of the six fio jobs that would be run.

A new category is introduced for Prometheus and PCP called "Persistent" tools. Persistent tools are started once at the beginning of a benchmark convenience script and stopped at its end. Using the previous pbench-fio example, persistent tools would be started before any of the six pbench-fio jobs begin and would be stopped once all six end.
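
The difference between the two categories can be illustrated by listing the start and stop events for the pbench-fio -b 4,16,32 -t read,write example. The following Python sketch is a simulation only, not Pbench Agent code: the function name tool_events, the order in which the six jobs run, and the choice of iostat and node-exporter as the example transient and persistent tools are assumptions.

```python
from itertools import product


def tool_events(block_sizes, test_types, transient, persistent):
    """Return the ordered list of start/stop events for one benchmark script run."""
    # Persistent tools start once, before any job runs.
    events = [f"start {tool}" for tool in persistent]
    for test_type, block_size in product(test_types, block_sizes):
        # Transient tools bracket each individual job.
        events += [f"start {tool}" for tool in transient]
        events.append(f"run fio {test_type} bs={block_size}")
        events += [f"stop {tool}" for tool in transient]
    # Persistent tools stop once, after all jobs have finished.
    events += [f"stop {tool}" for tool in persistent]
    return events


events = tool_events([4, 16, 32], ["read", "write"],
                     transient=["iostat"], persistent=["node-exporter"])
print(events.count("start iostat"), events.count("start node-exporter"))
```

The script prints 6 1: the transient tool is started six times, once per fio job, while the persistent tool is started once for the whole run.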

When persistent tools are used, data is continuously collected from the data sources ("exporters", in the case of Prometheus, and "PMCDs", in the case of PCP) and stored local to the execution of the Tool Data Sink.

Note that for transient tools, where data for the transient tool is collected locally on the host where the tool is registered, the collected data is usually sent to the Tool Data Sink when the benchmark workload finishes, though in some cases the data won't be sent until the very end to avoid impacting the behavior of the benchmark workload (e.g. pbench-specjbb2005).

Prometheus tools: node-exporter and dcgm

Two new pbench "tools" have been added, node-exporter and dcgm. If either or both of these new tools is registered (e.g. via pbench-register-tool --name=node-exporter --remotes=a.example.com), then the Tool Meister sub-system will run the node_exporter code on the hosts (in this case, a.example.com) and a local instance of Prometheus to collect the data. The collected Prometheus data is stored in the pbench result directory as a tar ball at: ${pbench_run}/<script>_<config>_YYYY.MM.DDTHH.mm.ss/tools-<group>/prometheus.

...


v0.71.0-alpha.0 (agent-only)

27 Apr 12:39
Pre-release

This is a very significant "minor" release of the pbench-agent code base, primarily to deliver the new "Tool Meister" sub-system.

It also delivers:

  • Support for RHEL 9 & CentOS Stream 9
  • Tool registration kept local to the host where registration happens
  • Support of Prometheus and PCP tool data collection
  • The default tool set has been reduced to iostat, sar, & vmstat tools
  • Removal of gratuitous manipulation of networking firewalls
  • Removal of gratuitous software installation, only checks for requirements
    • True for both tools and benchmark script requirements
    • Change to check command versions instead of RPM versions for pbench-fio, pbench-linpack, and pbench-uperf
  • The pbench-linpack benchmark script now provides result graphs, JSON data files, and supports execution on one or more local / remote hosts
  • Required use of --user with pbench-move-results/pbench-copy-results
  • Support for the new HTTP PUT method of posting tar balls
  • Removal of the dependency on the SCL (Software Collections Library)
  • Support for pbench-trafficgen benchmark script dropped entirely
  • Deprecation announcements for unused benchmark convenience scripts:
    • pbench-run-benchmark, pbench-cyclictest, pbench-dbench, pbench-iozone, pbench-migrate, and pbench-netperf
  • Many, many, bug fixes and behavioral improvements

You can review the Full ChangeLog on GitHub (all 550+ commits, tags b0.69-bp to v0.71.0-alpha.0), or read a summary with relevant details below.

We did not bump the "major" release version number with these changes because we do not yet consider all the functionality needed for a major version bump to be in place.

Note that work on the v0.71 release started in earnest with the v0.69.3-agent release (tagged as b0.69-bp). A number of bug fixes and behaviors from the v0.71 work have already been back-ported and delivered in the various v0.69.* releases since then. These release notes will highlight only the behavioral changes that have not been back-ported previously.

Installation

There are no other installation changes in this release: see the Getting Started Guide for how to install or update.

After installation or update, you should have version 0.71.0-XXgXXXXXXXXX of the pbench-agent RPM installed.

RPMs are available from Fedora COPR, covering Fedora 35, 36, EPEL 7, 8, 9.

There are Ansible playbooks available via Ansible Galaxy to install the pbench-agent, and the pieces needed (key and configuration files) to be able to send results to a server. To use the RPMs provided above via COPR with the playbooks, an inventory file needs to include the fedoraproject_username variable set to portante, for example:

...

[servers:vars]
fedoraproject_username=portante

...

Alternatively, one can specify fedoraproject_username on the command line, rather than having it specified in the inventory file:

ansible-playbook -i <inventory> <playbook> -e '{fedoraproject_username: portante}'

NOTE WELL: If the inventory file also defines pbench_repo_url_prefix (which was standard practice before fedoraproject_username was introduced), that definition must be deleted; otherwise it overrides the default repo URL and the fedoraproject_username setting will not take effect.

While we don't include installation instructions for the new node-exporter and dcgm tools in the published documentation, you can find a manual installation procedure for the Prometheus "node_exporter" and references to the Nvidia "DCGM" documentation in the agent/tool-scripts/README.

Container images built using the above RPMs are available in the Pbench organization in the Quay.io container image repository using tags beta, v0.71.0-XX, and XXXXXXXXX.

Summary of Changes

Support for RHEL 9 & CentOS 9

This release provides support for RHEL 9 & CentOS 9. Note that since RHEL 9 has not yet reached general availability (GA), further changes might still be needed to support it.

The New "Tool Meister" Sub-System

The "Tool Meister" sub-system (introduced by PR #1248) is the major piece of functionality delivered with the release of v0.71 of the pbench-agent.

This is a significant change. The pbench-agent first orchestrates the instantiation of a "Tool Meister" process on every host that has registered tools, using a Redis instance to coordinate their operation, while the new "Tool Data Sink" process handles the collection of data into the pbench run directory hierarchy. This effectively eliminates all remote SSH operations for individual tools except one per host to orchestrate the creation of the Tool Meister instance.

One Tool Meister instance is created per registered host, and then a single Tool Data Sink instance is created on the host where the benchmark script is run. The Tool Data Sink is responsible for collecting and storing locally all data sent to it from the deployed Tool Meister instances.

User Orchestration of "Tool Meister" Sub-System

Container images are provided for the constituent components of the Tool Meister sub-system, the Tool Meister image and the Tool Data Sink image. The images allow for the orchestration of the Tool Meister sub-system to be handled by the user instead of automatically by the pbench-agent.

The "Tool Meister" Sub-System with No Tools

While this is not a new feature of the Pbench Agent, it is worth noting that when no tools are registered, the "Tool Meister" sub-system is not deployed and the benchmark scripts still execute normally.

All Tool Registration Handled Locally

Along with the new "Tool Meister" sub-system comes another subtle, but significant, change to how tools are registered.

With the v0.71 release, the record of which tools are registered on which hosts is kept local to the host on which pbench-register-tool or pbench-register-tool-set is invoked.

Prior to v0.71, tool registration for remote hosts was recorded both locally and, via SSH, on the remote hosts themselves.

The registered tools are recorded in a local directory off of the "pbench_run" directory, by default /var/lib/pbench-agent/tools-v1-<name>, where <name> is the name of the Tool Group under which the tools were registered.
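As a minimal sketch of the naming scheme just described, the following Python helper builds the registration directory for a Tool Group and reports whether it exists. The helper name `tool_group_dir` and the group name `default` used in the example are illustrative assumptions; only the `/var/lib/pbench-agent` default and the `tools-v1-<name>` pattern come from these notes.

```python
from pathlib import Path

# Default "pbench_run" directory, as stated in the release notes.
DEFAULT_PBENCH_RUN = "/var/lib/pbench-agent"


def tool_group_dir(group, pbench_run=DEFAULT_PBENCH_RUN):
    """Return the directory where tools for the given Tool Group are recorded.

    Hypothetical helper: it only applies the tools-v1-<name> naming pattern.
    """
    return Path(pbench_run) / f"tools-v1-{group}"


if __name__ == "__main__":
    # "default" is an assumed group name, used here purely for illustration.
    path = tool_group_dir("default")
    state = "present" if path.is_dir() else "not present"
    print(f"{path}: {state}")
```

On a host where nothing has been registered for that group, the directory will simply be reported as not present.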

Registering tools on local or remote hosts no longer validates that those tools are actually available. Instead, the Tool Meister sub-system reports when registered tools are not present on registered hosts before beginning a benchmark run: an error message is displayed, and the particular "bench-script" exits with a failure code.

All tools registered prior to installing v0.71.0-alpha must be re-registered; tools registered locally or remotely on a host with v0.69 or earlier of the pbench-agent will be ignored.

New Support for Prometheus and PCP-based Tools

The new "Tool Meister" sub-system enables support of Prometheus and PCP-based tools for data collection.

The existing tools supported prior to the v0.71 release can be categorized as "Transient" tools. By transient we mean that a given tool is started and stopped immediately around the execution of a benchmark workload. For example, when using pbench-fio -b 4,16,32 -t read,write, the transient tools are started immediately before each fio job is executed, and stopped immediately following its completion, for each of the 6 (six) fio jobs that would be run.

A new category is introduced for Prometheus and PCP called "Persistent" tools. Persistent tools are started once at the beginning of a benchmark script and stopped at its end. Using the previous pbench-fio example, persistent tools would be started before any of the 6 (six) fio jobs begin, and would be stopped once all six end.
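The difference between the two categories can be illustrated with a small Python model of the pbench-fio example (three block sizes, two test types). This is only a sketch of the ordering of start and stop events, not pbench-agent code; the function name `tool_events` and the iteration order of the jobs are assumptions.

```python
from itertools import product


def tool_events(block_sizes, test_types):
    """Return the ordered start/stop events for one benchmark script run.

    Persistent tools bracket the whole run; transient tools bracket each job.
    The order in which jobs are iterated is an assumption for illustration.
    """
    events = ["start persistent tools"]
    for block_size, test_type in product(block_sizes, test_types):
        job = f"fio job (block size {block_size}, type {test_type})"
        events.append(f"start transient tools for {job}")
        events.append(f"run {job}")
        events.append(f"stop transient tools for {job}")
    events.append("stop persistent tools")
    return events


# Mirrors: pbench-fio -b 4,16,32 -t read,write
events = tool_events([4, 16, 32], ["read", "write"])

if __name__ == "__main__":
    for event in events:
        print(event)
```

In this model the transient tools are started and stopped six times, once per fio job, while the persistent tools are started and stopped exactly once.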

When persistent tools are used, data is continuously collected from the data sources ("exporters", in the case of Prometheus, and "PMCDs", in the case of PCP) and stored locally on the host where the Tool Data Sink runs.

Note that for transient tools, where data is collected locally on the host where the tool is registered, the collected data is sent to the Tool Data Sink when the benchmark script deems that doing so won't impact the behavior of the benchmark itself.

Prometheus tools: node-exporter and dcgm

Two new pbench "tools" have been added, node-exporter and dcgm. If one registers either or both of these new tools (e.g. via pbench-register-tool --name=node-exporter), then the Tool Meister sub-system will run the node_exporter code on the registered hosts, and a local instance of Prometheus to collect the data. The collected Prometheus data is stored in the pbench result directory as a tar ball at: ${pbench_run}/<script>_<config>_YYYY.MM.DDTHH.mm.ss/tools-<group>/prometheus.
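To make the path template concrete, here is a hedged Python sketch that fills it in. The helper name `prometheus_dir`, the example script, config, and group values, and the reading of `mm` as minutes are all assumptions made for illustration; only the overall template comes from these notes.

```python
from datetime import datetime
from pathlib import Path


def prometheus_dir(pbench_run, script, config, started, group):
    """Fill in <pbench_run>/<script>_<config>_<timestamp>/tools-<group>/prometheus.

    Assumes the timestamp is YYYY.MM.DDTHH.mm.ss with "mm" meaning minutes.
    """
    stamp = started.strftime("%Y.%m.%dT%H.%M.%S")
    return Path(pbench_run) / f"{script}_{config}_{stamp}" / f"tools-{group}" / "prometheus"


# Hypothetical values, chosen only to show the shape of the resulting path.
example = prometheus_dir(
    "/var/lib/pbench-agent", "fio", "myconfig", datetime(2022, 4, 27, 12, 39, 0), "default"
)

if __name__ == "__main__":
    print(example)
```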

For the duration of the run, the Prometheus instance is available on localhost:9090 if one desires to review the metrics being collected live.
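One way to look at the live metrics is through Prometheus's standard HTTP API. The sketch below builds an instant-query URL against the localhost:9090 address mentioned above and, when called, fetches the result; the helper names are hypothetical, and `up` is Prometheus's built-in per-target health metric rather than anything specific to pbench.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Address of the local Prometheus instance, as given in the release notes.
PROMETHEUS = "http://localhost:9090"


def instant_query_url(expr, base=PROMETHEUS):
    """Build a URL for the Prometheus instant-query endpoint (/api/v1/query)."""
    return f"{base}/api/v1/query?{urlencode({'query': expr})}"


def instant_query(expr, base=PROMETHEUS, timeout=5):
    """Run an instant query and return the decoded JSON response.

    Only succeeds while a benchmark run is active and Prometheus is listening.
    """
    with urlopen(instant_query_url(expr, base), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Print the URL only; call instant_query("up") during a run to fetch data.
    print(instant_query_url("up"))
```

Because the Prometheus instance only exists for the duration of the run, `instant_query` will raise a connection error at any other time.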

NOTE WELL: like all the other "tools" the pbench-agent supports, the node-exporter and dcgm tools themselves need to be installed separately on the registered hosts.

The new dcgm tool relies on an Nvidia-based install that requires Python 2, which might conflict with the Pbench Agent's Python 3 operational requirement in some cases.

The PCP tool

Just like the new Prometheus based tools, you can register "PCP" as a persistent tool using: `pbench-register-tool -...


v0.71.0-qe.02 (agent-only)

11 Apr 17:27
Pre-release
Removed package-lock.json

v0.69.10 (agent-only)

08 Apr 18:43
Pre-release

This is an agent-only release, changing pbench-trafficgen to use the bench-trafficgen repo.

Beyond the pbench-trafficgen work, we also have fixes for pbench-specjbb recorded metadata, improved error handling for pbench-move/copy-results, and documentation for the pbench-clear-tools -r option.

This release also includes server-side commits which will not be released in an RPM.

What's Changed (Agent)

What's Changed (Server)

Server side commits which will not show up in an RPM.

Full Changelog: v0.69.9...v0.69.10