[dy] Record memory and cpu usage for pipeline runs #4761
+124
−11
Description
This PR adds memory and CPU tracking for pipeline runs in the backend. A follow-up PR will surface the metrics in the frontend.
Calculating memory is relatively straightforward: we can get the memory usage for all processes belonging to a pipeline run, using `psutil.Process.memory_percent()` to get the usage per pipeline. CPU can be handled in a similar way, but getting the CPU usage percentage per process is more complicated. The comments in the code explain the implementation in more depth, but basically we need to record the CPU usage at several points in time and compare the numbers to get the CPU usage percent.

Right now, we collect memory and CPU usage every time a heartbeat runs for the pipeline run, so roughly every 10 seconds. The usage is stored in the `PipelineRun.metrics` field. The pipeline run's CPU and memory usage will also be included in the tags for the heartbeat log.

Example:
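A minimal sketch of the sample-and-compare idea for CPU usage, not the code in this PR. The names `Sample`, `cpu_percent_between`, `sample_process`, and `heartbeat_metrics`, as well as the `"cpu"` and `"memory"` keys, are hypothetical; only `psutil.Process.cpu_times()` and `psutil.Process.memory_percent()` are real psutil calls. It assumes a single process per pipeline run, whereas a real implementation would likely aggregate over all of the run's processes.

```python
import time
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Sample:
    cpu_seconds: float   # cumulative user + system CPU time of the process
    wall_seconds: float  # monotonic clock reading when the sample was taken


def cpu_percent_between(prev: Sample, curr: Sample) -> float:
    """CPU usage between two samples, as a percentage of one core."""
    elapsed = curr.wall_seconds - prev.wall_seconds
    if elapsed <= 0:
        return 0.0  # no time has passed, so no meaningful rate
    return 100.0 * (curr.cpu_seconds - prev.cpu_seconds) / elapsed


def sample_process(pid: int) -> Tuple[Sample, float]:
    """Read cumulative CPU time and memory percent for one process."""
    import psutil  # third-party; imported lazily so the helpers above need only the stdlib

    proc = psutil.Process(pid)
    times = proc.cpu_times()
    sample = Sample(times.user + times.system, time.monotonic())
    return sample, proc.memory_percent()


def heartbeat_metrics(prev: Optional[Sample], pid: int) -> Tuple[Sample, dict]:
    """Called on each heartbeat; returns the new sample and a metrics dict."""
    curr, memory = sample_process(pid)
    # The first heartbeat has nothing to compare against, so CPU is unknown.
    cpu = cpu_percent_between(prev, curr) if prev is not None else None
    return curr, {"cpu": cpu, "memory": memory}
```

The caller would keep the returned `Sample` and pass it back in on the next heartbeat, so each CPU figure covers the interval since the previous one (roughly 10 seconds here).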
How Has This Been Tested?
Checklist
docs/mint.json
cc: @wangxiaoyou1993