Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Addition of LevelZero metric support to the LevelZero device pool and IO Group #2897

Open
wants to merge 24 commits into
base: dev
Choose a base branch
from

Conversation

lhlawson
Copy link
Contributor

@lhlawson lhlawson commented Mar 15, 2023

The addition of ZET metric polling capability to the LevelZero IO Group, Device Pool, and API shim.

General Metric Streamer flowchart implemented follows the LevelZero approach shown here

Details of change:

  • Updated LevelZero.cpp to set ZET environment variables and follow the general time based metric query approach outlined in the LevelZero Tools API.
  • Added querying for the 'ComputeBasics' group of Intel Hardware. An additional metric group may be queried but is not supported at this time.

Additional Tasks:

  • Define Default LevelZero Tools report/event rate based on workload performance
  • Add Test for signal and control of LevelZero Tools.
  • Later PR: Consider adding LEVELZERO::METRICS:XVE_*:AGGREGATION. Currently all signals are averaged over the number of reports received, but this could be modified to be AVG, LAST, MIN, MAX, etc
  • Test dynamic tracing with geopm controller (read_signal and batch read have been tested, but not with the controller)
  • Add multi-tile support
  • Re-implement metric_destroy/teardown mechanism
  • Remove geopmread functionality until an appropriate tear-down handling with read_batch and read_signal can occur.
  • Later PR: Add 'check and fail' if the metric group is changed out from under the user

@@ -19,11 +19,13 @@
static void __attribute__((constructor)) geopm_levelzero_init(void)
{
setenv("ZES_ENABLE_SYSMAN", "1", 1);
// setenv("ZET_ENABLE_METRICS", "1", 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// setenv("ZET_ENABLE_METRICS", "1", 1);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-enabled ZET by default.

for (int subdevice_idx = 0;
subdevice_idx < m_devices.at(device_idx).m_num_subdevice;
++subdevice_idx) {
//assume false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to just be truth rather than assumption at this point, since it isn't cached or initialized yet


std::vector<zet_metric_group_handle_t> metric_group_handle(num_metric_group);

//TODO: save the relevant per GPU pointers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which pointers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO removed, this was a hold over from when the metric group handle, context, etc were not being stored in the tracking struct.

__LINE__);

//sampling period in nanoseconds
m_devices.at(device_idx).metric_sampling_period = 2000000;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optionally rename to make sure the units are known everywhere this is used. E.g., sampling_period_ns

@@ -155,6 +188,10 @@ namespace geopm

std::vector<ze_driver_handle_t> m_levelzero_driver;
std::vector<m_device_info_s> m_devices;

void metric_group_cache(unsigned int l0_device_idx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a comment explaining what this is caching and/or what data or functions are invalid until this is cached. My read-through understanding is:

  • Caches to several parts of m_devices. I guess this function needs to be called after initialization and before attempting to request any metric data.
  • Caches the names of metrics in the ComputeBasic L0 metric group.

Maybe rename this to indicate that it is hard-coded to computebasic. Or parameterize the computebasic part.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or docs for the other functions might cover that (e.g., they say they'll throw if not yet cached)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment added

ze_event_pool_handle_t event_pool_handle = nullptr;
ze_event_pool_desc_t event_pool_desc = {ZE_STRUCTURE_TYPE_EVENT_POOL_DESC, nullptr, 0, 1};

ze_result = zeEventPoolCreate(context, &event_pool_desc, 1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Knowledge check for myself. Docs say 1 here is the number of devices. Is this 1 because we're initializing one GPU at a time?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From reading further down I guess it also means we're sampling one device at a time, since this is the event that read attempts block on while waiting for metric data

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're targeting 1 event in each event pool, and 1 sub-device per event.

zet_metric_streamer_desc_t metric_streamer_desc = {
ZET_STRUCTURE_TYPE_METRIC_STREAMER_DESC,
nullptr,
32768,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this number of reports per notification?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to 4 reports (1ms per report target)

service/src/LevelZero.cpp Outdated Show resolved Hide resolved
m_devices.at(l0_device_idx).subdevice.metric_streamer.at(l0_domain_idx));
}
else {
metric_read(l0_device_idx, l0_domain_idx); //TODO: possible infinite loop
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this have a timeout mechanism? E.g., change it to a loop around line 858 and trigger an error if too many attempts or too much time has passed without the event

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to add a non-zero timeout to the last arg in zeEventHostSynchronize if going that route

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The recursive call is removed in the most recent version. In cases where the event is not ready the old data will be provided to the user.
We may want to add a 'collected' timestamp so the user can identify if this happens.

Comment on lines 860 to 864
if (ze_host_result != ZE_RESULT_NOT_READY) {
metric_calc(l0_device_idx, l0_domain_idx,
m_devices.at(l0_device_idx).subdevice.metric_streamer.at(l0_domain_idx));
}
else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably be something like if == NOT_READY then retry else if SUCCESS then calc() else error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Retry logic removed in most recent version.

service/src/LevelZero.cpp Outdated Show resolved Hide resolved
service/src/LevelZero.cpp Outdated Show resolved Hide resolved
service/src/LevelZero.hpp Outdated Show resolved Hide resolved
Comment on lines 531 to 532
subdevice_idx < m_devices.at(device_idx).m_num_subdevice;
++subdevice_idx) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please fix indentation

service/src/LevelZeroImp.hpp Outdated Show resolved Hide resolved
service/src/LevelZero.cpp Outdated Show resolved Hide resolved
service/src/LevelZero.cpp Outdated Show resolved Hide resolved
service/src/LevelZero.cpp Outdated Show resolved Hide resolved
for (unsigned int report_idx = 0; report_idx < num_reports; report_idx++)
{
//This is the actual gathering of the data
zet_typed_value_t data = metric_values.at(report_idx*num_metric + metric_idx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preferably, it would be better if a data structure was used that more naturally represented the data being stored. If it's not convenient for whatever reason, please add a comment explaining this formula and how metric_values is structured such that this formula correctly indexes into the data item of interest.

I'm a bit nervous that report_idx depends on num_reports which is num_metric_values / num_metric. Does this formula take into account what happens when num_metric_values is not divisible by num_metric? If needed, please add checks and/or assertions to ensure any preconditions needed for this formula to work.

Comment on lines 845 to 846
m_devices.at(l0_device_idx).subdevice.m_metric_data.at(l0_domain_idx).at("XVE_ACTIVE") = {};
m_devices.at(l0_device_idx).subdevice.m_metric_data.at(l0_domain_idx).at("XVE_STALL") = {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider putting these strings (the metric names) in symbolic constants, as they're used in multiple places (including the IOGroup).

@lhlawson lhlawson force-pushed the public-lhlawson-zet-support branch from 9de489f to 8ead8a9 Compare March 27, 2023 23:12
@lhlawson lhlawson marked this pull request as ready for review March 29, 2023 01:20

#include "LevelZeroImp.hpp"

static void __attribute__((constructor)) geopm_levelzero_init(void)
{
setenv("ZES_ENABLE_SYSMAN", "1", 1);
setenv("ZET_ENABLE_METRICS", "1", 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be set conditionally somehow?

If it's too late to set this when push_signal() is called with metrics based signals/controls, then I think this has to be done at compile time with a #define.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're setting this at all times, but having the actual usage of ZET (and context running on the GPU) be conditional on the user doing a push signal of a ZET metric and the first read of those metrics.

Comment on lines -254 to +264
const LevelZero &levelzero();
LevelZero &levelzero();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change was required for the lazy init approach being used to update some level zero metric containers. After more recent code changes it might not be required.

service/src/LevelZero.cpp Outdated Show resolved Hide resolved
service/src/LevelZero.cpp Outdated Show resolved Hide resolved
service/src/LevelZero.cpp Outdated Show resolved Hide resolved
service/src/LevelZero.cpp Outdated Show resolved Hide resolved
service/src/LevelZero.cpp Outdated Show resolved Hide resolved

zet_metric_streamer_desc_t metric_streamer_desc = {
ZET_STRUCTURE_TYPE_METRIC_STREAMER_DESC,
nullptr,
32768,
m_devices.at(l0_device_idx).metric_sampling_period};
4, // number of reports to notify on.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as before: why 4 reports at 1 ms per sample? Is that something to do with our typical 5ms agent loops?

service/src/LevelZero.cpp Outdated Show resolved Hide resolved
dannosliwcd
dannosliwcd previously approved these changes Apr 13, 2023
Copy link
Contributor

@avilcheslopez avilcheslopez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still seeing multiple "TODO" comments, which makes it seem like this is still in draft form. Which of those TODOs need to be addressed as part of this PR?

// Allocate memory for future reads
size_t data_size = 0;
ze_result = zetMetricStreamerReadData(m_devices.at(l0_device_idx).subdevice.metric_streamer.at(l0_domain_idx),
UINT32_MAX, &data_size, nullptr); //TODO: this value should match the report_count_req in metric_read
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: alignment

Comment on lines +191 to +185
subdevice_idx < m_devices.at(gpu_idx).num_subdevice;
++subdevice_idx) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: fix indentation


std::string metric_name (metric_properties.name);

//TODO: check timing impact of only processing metrics supported by IOGroup
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO for this PR?

": LevelZero Metric property acquisition failed",
__LINE__);

std::string metric_name (metric_properties.name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
std::string metric_name (metric_properties.name);
std::string metric_name(metric_properties.name);

metric_name == "XVE_STALL") {
for (unsigned int report_idx = 0; report_idx < num_reports; report_idx++) {
// Each report contains num_metric entries of data. We need to access
// this metric (metric_idx) from each report ( report_idx * num_metric)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
// this metric (metric_idx) from each report ( report_idx * num_metric)
// this metric (metric_idx) from each report (report_idx * num_metric)


for (unsigned int metric_idx = 0; metric_idx < num_metric; metric_idx++)
{
//TODO: It is possible that simply parsing all the metrics is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO for this PR?

": LevelZero Event Create failed",
__LINE__);

// TODO: nothing guarantees CHIP 0 was called first
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:o

zet_metric_streamer_desc_t metric_streamer_desc = {
ZET_STRUCTURE_TYPE_METRIC_STREAMER_DESC,
nullptr,
4, // number of reports to notify on. Targeting 4 reports, 1 per millisecond as that will, generally,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be better to place in a symbolic constant, to make it easier to update when needed (as I know you've had to test different values), and make the code easier to read.

Signed-off-by: lhlawson <lowren.h.lawson@intel.com>
Signed-off-by: lhlawson <lowren.h.lawson@intel.com>
Signed-off-by: lhlawson <lowren.h.lawson@intel.com>
Signed-off-by: lhlawson <lowren.h.lawson@intel.com>
Signed-off-by: lhlawson <lowren.h.lawson@intel.com>
Signed-off-by: lhlawson <lowren.h.lawson@intel.com>
lhlawson and others added 18 commits February 7, 2024 13:54
Signed-off-by: lhlawson <lowren.h.lawson@intel.com>
Signed-off-by: lhlawson <lowren.h.lawson@intel.com>
Signed-off-by: lhlawson <lowren.h.lawson@intel.com>
Signed-off-by: lhlawson <lowren.h.lawson@intel.com>
Signed-off-by: lhlawson <lowren.h.lawson@intel.com>
Signed-off-by: lhlawson <lowren.h.lawson@intel.com>
Signed-off-by: lhlawson <lowren.h.lawson@intel.com>
Signed-off-by: lhlawson <lowren.h.lawson@intel.com>
Co-authored-by: Brad Geltz <brgeltz@gmail.com>
Signed-off-by: lhlawson <lowren.h.lawson@intel.com>
Signed-off-by: lhlawson <lowren.h.lawson@intel.com>
…read of METRIC signal

Signed-off-by: lhlawson <lowren.h.lawson@intel.com>
Signed-off-by: Alejandro Vilches <alejandro.vilches@intel.com>
Signed-off-by: Alejandro Vilches <alejandro.vilches@intel.com>
Signed-off-by: Alejandro Vilches <alejandro.vilches@intel.com>
Signed-off-by: Alejandro Vilches <alejandro.vilches@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants