
apis: add MetricPrediction crd #1875

Open · wants to merge 1 commit into base: main

Conversation

@zwzhang0107 (Contributor) commented Jan 29, 2024

Ⅰ. Describe what this PR does

Define a metric prediction CRD for recommendation and prediction.

The following YAML defines a MetricPrediction called mericprediction-sample.

The spec of mericprediction-sample declares that it needs a resource prediction for a workload:

  • the prediction is at container level for a Deployment called nginx
  • the metric types are cpu and memory, collected from metrics-server
  • the distribution profiler is used, which computes statistics over historical usage (a small sketch of such a computation follows the example below)

The status of mericprediction-sample returns the cpu and memory profiling results for all containers of the nginx workload.

apiVersion: analysis.koordinator.sh/v1alpha1
kind: MetricPrediction
metadata:
  name: mericprediction-sample
  namespace: default
spec:
  target:
    type: workload
    workload:
      apiVersion: apps/v1
      kind: Deployment
      name: nginx
      hierarchy:
        level: container
  metric:
    source: metricServer
    metricServer:
      resources: [cpu, memory]
  profilers:
  - name: recommendation-sample
    model: distribution
    distribution:
      # args
status:
  results:
  - profilerName: recommendation-sample
    model: distribution
    distributionResult:
      items:
      - id:
          level: container
          name: nginx-container
        resources:
        - name: cpu
          avg: 6850m
          quantiles:
            # ...
            p95: 7950m
            p99: 8900m
          stdDev: 759m
          firstSampleTime: 2024-01-29T07:15:56Z
          lastSampleTime: 2024-01-30T07:15:56Z
          totalSamplesCount: 10000
          updateTime: 2024-01-30T07:16:56Z
          conditions: []
        - name: memory
          avg: 1000Mi
          quantiles:
            # ...
            p95: 1100Mi
            p99: 1200Mi
          stdDev: 100Mi
          firstSampleTime: 2024-01-29T07:15:56Z
          lastSampleTime: 2024-01-30T07:15:56Z
          totalSamplesCount: 10000
          updateTime: 2024-01-30T07:16:56Z
          conditions: []
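
As a rough illustration (not part of this PR, and the Go types below are invented for the example), the distribution profiler described above boils down to computing statistics such as the average, quantiles, and standard deviation over historical usage samples:

package main

import (
	"fmt"
	"math"
	"sort"
)

// distributionStats mirrors the per-resource fields of distributionResult in the
// example status above (avg, quantiles, stdDev); the Go types are illustrative only.
type distributionStats struct {
	Avg      float64
	P95, P99 float64
	StdDev   float64
	Samples  int
}

// profileDistribution computes simple statistics over historical usage samples,
// which is what the distribution profiler is described to do. A real profiler
// would typically use a decaying histogram instead of raw samples.
func profileDistribution(samples []float64) distributionStats {
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)

	var sum, sqSum float64
	for _, s := range sorted {
		sum += s
		sqSum += s * s
	}
	n := float64(len(sorted))
	avg := sum / n
	quantile := func(q float64) float64 {
		idx := int(math.Ceil(q*n)) - 1
		if idx < 0 {
			idx = 0
		}
		return sorted[idx]
	}
	return distributionStats{
		Avg:     avg,
		P95:     quantile(0.95),
		P99:     quantile(0.99),
		StdDev:  math.Sqrt(sqSum/n - avg*avg),
		Samples: len(sorted),
	}
}

func main() {
	// CPU usage samples in cores, e.g. scraped from metrics-server.
	usage := []float64{6.1, 6.9, 7.2, 6.5, 8.0, 7.9, 6.4, 7.1, 8.9, 6.8}
	fmt.Printf("%+v\n", profileDistribution(usage))
}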

Ⅱ. Does this pull request fix one issue?

More information can be found in #1880.

Ⅲ. Describe how to verify it

Ⅳ. Special notes for reviews

Integrating with the Metric Prediction Framework
The Metric Prediction Framework is a kind of "deep module", providing algorithms and prediction models in the backend. Multiple profilers could be built with Metric Prediction as a foundation. Here are some scenarios showing how the framework can be used.

  • Resource Recommender for Workload
    The spec of Recommendation declares that it needs the recommended resources (CPU and memory) for a Deployment named nginx-sample, and recommendResources in the status shows the result for each container.
apiVersion: analysis.koordinator.sh/v1alpha1
kind: Recommendation
metadata:
  name: recommendation-sample
  namespace: recommender-sample
spec:
  workloadRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-sample
status:
  recommendResources:
    containerRecommendations:
    - containerName: nginx-container
      target:
        cpu: 4742m
        memory: 262144k

The recommendation is calculated from quantile values of historical metrics. Using Metric Prediction as the profiling model, the requirement of recommendation-sample can be expressed as a MetricPrediction.
For different kinds of workloads, the recommendation can select a specific quantile value from MetricPrediction, for example p95 for a Deployment and the average for a Job, then add a 10–15% safety margin (a minimal sketch of this calculation follows the MetricPrediction example below).

apiVersion: analysis.koordinator.sh/v1alpha1
kind: MetricPrediction
metadata:
  name: mericprediction-sample
  namespace: default
spec:
  target:
    type: workload
    workload:
      apiVersion: apps/v1
      kind: Deployment
      name: nginx-sample
      hierarchy:
        level: container
  metric:
    source: metricServer
    metricServer:
      resources: [cpu, memory]
  profilers:
  - name: recommendation-sample
    model: distribution
    distribution:
      # args
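
A minimal sketch of the recommendation calculation described above, assuming hypothetical Go types that mirror the distributionResult fields from the example status; the quantile choice (p95 for a Deployment, average for a Job) and the 15% safety margin follow the prose, everything else is illustrative:

package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

// distributionResource mirrors one entry of distributionResult.items[].resources[]
// from the example status above (simplified; field names are illustrative only).
type distributionResource struct {
	Name      string
	Avg       resource.Quantity
	Quantiles map[string]resource.Quantity
}

// recommend picks a quantile depending on the workload kind (p95 for long-running
// Deployments, the average for Jobs) and adds a safety margin on top.
func recommend(r distributionResource, workloadKind string, margin float64) *resource.Quantity {
	base := r.Avg
	if workloadKind == "Deployment" {
		if q, ok := r.Quantiles["p95"]; ok {
			base = q
		}
	}
	value := base.AsApproximateFloat64() * (1 + margin)
	// Scale back to milli-units so the result stays a valid Kubernetes quantity.
	return resource.NewMilliQuantity(int64(value*1000), resource.DecimalSI)
}

func main() {
	cpu := distributionResource{
		Name: "cpu",
		Avg:  resource.MustParse("6850m"),
		Quantiles: map[string]resource.Quantity{
			"p95": resource.MustParse("7950m"),
			"p99": resource.MustParse("8900m"),
		},
	}
	// p95 (7950m) plus a 15% margin -> roughly 9142m.
	fmt.Println(recommend(cpu, "Deployment", 0.15).String())
}
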
  • Hotspot Prediction by Time-Series Metrics
    Pod orchestration on a node varies over time, and each pod has its own cycle of resource usage. The NodeQoS CR below describes the usage prediction derived from time-series-based workload metric predictions.
apiVersion: analysis.koordinator.sh/v1alpha1
kind: NodeQoS
metadata:
  name: node-sample
spec:
  usagePredictionPolicy: workloadByTime
status:
  usageOverTime:
  - timeWindow: "0~1" # 0~1 hour
    max:
      cpu: 6039m
      memory: 18594k
    average:
      cpu: 4028m
      memory: 15782k
    p95:
      cpu: 5731m
      memory: 18043k
  - timeWindow: "1~2" # 1~2 hour
    max:
      cpu: 6039m
      memory: 18594k
    average:
      cpu: 4028m
      memory: 15782k
    p95:
      cpu: 5731m
      memory: 18043k

The usageOverTime result in NodeQoS is aggregated from the MetricPrediction of all workloads currently running on the Node, so that the descheduler can check whether any node will be overloaded in the near future and then rebalance some pods to other nodes (a rough aggregation sketch follows the MetricPrediction example below).

apiVersion: analysis.koordinator.sh/v1alpha1
kind: MetricPrediction
metadata:
  name: mericprediction-sample
  namespace: default
spec:
  target: # workload
  metric:
    source: metricServer
    metricServer:
      resources: [cpu, memory]
    prometheus:
    - resource: memoryBandwidth
      name: container_memory_bandwidth
  profilers:
  - name: timeseries-sample
    model: timeseries-trend
    timeseries-trend: # args
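
A rough sketch, with assumed types that are not part of this PR, of how the node-level usageOverTime could be aggregated: the per-workload predictions for each time window are summed per resource, and the descheduler would compare the totals against node allocatable to find future hotspots.

package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

// windowUsage holds the predicted usage of one workload for one time window;
// the struct is illustrative and simplified from the NodeQoS example above.
type windowUsage struct {
	TimeWindow string
	P95        map[string]resource.Quantity // resource name -> predicted p95
}

// aggregateNodeUsage sums the p95 predictions of all workloads on the node
// per time window, producing per-window, per-resource node-level totals.
func aggregateNodeUsage(workloads [][]windowUsage) map[string]map[string]*resource.Quantity {
	node := map[string]map[string]*resource.Quantity{}
	for _, windows := range workloads {
		for _, w := range windows {
			if node[w.TimeWindow] == nil {
				node[w.TimeWindow] = map[string]*resource.Quantity{}
			}
			for name, q := range w.P95 {
				if node[w.TimeWindow][name] == nil {
					node[w.TimeWindow][name] = resource.NewQuantity(0, resource.DecimalSI)
				}
				node[w.TimeWindow][name].Add(q)
			}
		}
	}
	return node
}

func main() {
	nginx := []windowUsage{{TimeWindow: "0~1", P95: map[string]resource.Quantity{"cpu": resource.MustParse("5731m")}}}
	redis := []windowUsage{{TimeWindow: "0~1", P95: map[string]resource.Quantity{"cpu": resource.MustParse("1200m")}}}
	for window, usage := range aggregateNodeUsage([][]windowUsage{nginx, redis}) {
		// Prints the summed cpu p95 for window 0~1 (6931m).
		fmt.Println(window, usage["cpu"].String())
	}
}
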
  • Interference Detection for Workload Outliers
    A Pod may suffer interference at runtime due to resource contention on the node, which can be analyzed through CPI, PSI, CPU scheduling latency, etc. Specify an algorithm such as OCSVM in MetricPrediction, and the resulting model will be available in the status.
apiVersion: analysis.koordinator.sh/v1alpha1
kind: MetricPrediction
metadata:
  name: mericprediction-sample
  namespace: default
spec:
  target: # workload
  metric:
    prometheus:
    - resource: cpi
      name: container_cpi
    - resource: psi_cpu
      name: container_psi_cpu
    - resource: csl
      name: container_cpu_scheduling_latency
  profilers:
  - name: interference-sample
    model: OCSVM
    ocsvm: # args

The Interference Manager will parse the corresponding workload model and send it to koordlet. koordlet will execute QoS strategies once it finds that some pod is an outlier according to recent metrics (an illustrative sketch of this check follows).

(image: koordetector)
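
Purely for illustration (none of these types exist in koordlet), a sketch of how koordlet could consume the delivered model: score recent pod metrics against it and trigger a QoS strategy when the pod is flagged as an outlier.

package main

import "fmt"

// InterferenceModel abstracts whatever model (e.g. an OCSVM decision function)
// the Interference Manager delivers to koordlet; this interface is hypothetical.
type InterferenceModel interface {
	// IsOutlier reports whether the recent metric sample (cpi, psi_cpu, csl, ...)
	// falls outside the learned normal region.
	IsOutlier(sample map[string]float64) bool
}

// thresholdModel is a stand-in implementation: flag the pod when CPI exceeds
// a learned upper bound. A real OCSVM model would evaluate its decision function.
type thresholdModel struct{ cpiUpperBound float64 }

func (m thresholdModel) IsOutlier(sample map[string]float64) bool {
	return sample["cpi"] > m.cpiUpperBound
}

func main() {
	var model InterferenceModel = thresholdModel{cpiUpperBound: 1.8}
	recent := map[string]float64{"cpi": 2.1, "psi_cpu": 0.35}
	if model.IsOutlier(recent) {
		// Here koordlet would execute QoS strategies, e.g. throttling best-effort pods.
		fmt.Println("pod flagged as interference outlier")
	}
}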

V. Checklist

  • I have written necessary docs and comments
  • I have added necessary unit tests and integration tests
  • All checks passed in make test

codecov bot commented Jan 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 67.54%. Comparing base (07e51fa) to head (4531674).
Report is 117 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1875      +/-   ##
==========================================
+ Coverage   67.23%   67.54%   +0.30%     
==========================================
  Files         410      413       +3     
  Lines       45662    46072     +410     
==========================================
+ Hits        30702    31120     +418     
+ Misses      12742    12696      -46     
- Partials     2218     2256      +38     
Flag Coverage Δ
unittests 67.54% <ø> (+0.30%) ⬆️

Flags with carried forward coverage won't be shown.

@koordinator-bot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign hormes after the PR has been reviewed.
You can assign the PR to them by writing /assign @hormes in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@saintube (Member)


typo: mericprediction -> metricprediction

@hormes (Member) commented Jan 31, 2024

Add some user stories to help understand how the API is used

@zwzhang0107 (Contributor, Author)

Updated with more user stories.

@saintube (Member) left a comment


/lgtm

apis/analysis/v1alpha1/condition.go (review thread resolved, outdated)
apis/analysis/v1alpha1/groupversion_info.go (review thread resolved)
apis/analysis/v1alpha1/metric_spec.go (review thread resolved, outdated)
// API version of the referent
APIVersion string `json:"apiVersion,omitempty"`
// Hierarchy indicates the hierarchy of the target for profiling
Hierarchy ProfileHierarchy `json:"hierarchy,omitempty"`
Member
The WorkloadRef makes sense, but the Hierarchy field doesn't look connected to the workload reference; is the definition really appropriate here? I noticed that PodSelectorRef has the same situation, so is Hierarchy a field that needs to be described independently?

Contributor Author
The Hierarchy field is related to Workload; it means the metric prediction is at container or pod level, which is only effective for K8s workloads.
Metric Prediction can also work for other types of workloads beyond K8s, such as a FaaS job. Although the workload is not defined in K8s, the metrics are recorded in Prometheus format.

function_cpu_usage{service="service-word-count", function="f-map-name", slice="slice-0"} 1.2
function_cpu_usage{service="service-word-count", function="f-map-name", slice="slice-1"} 1.3
function_cpu_usage{service="service-word-count", function="f-reduce-name", slice="all"} 2

This means a workload called service-word-count, which consists of two jobs (map and reduce).
For the resource recommendation scenario, we want to profile service-word-count/f-map-name and service-word-count/f-reduce-name.
This workload can be defined as an AnalysisTargetPrometheusLabelGroup, and we should support service/function as the key for aggregation in PrometheusLabelGroup, which does not need the Hierarchy field.

Member
This information looks important and should be added to the proposal.

Contributor Author
We can add more explanation when we support PrometheusLabelGroup.

@hormes (Member) commented Feb 1, 2024

In the case where there is another layer in the usage scenario mentioned earlier, does MetricPrediction need to be a CRD?

apis/analysis/v1alpha1/metric_spec.go (review thread resolved, outdated)
apis/analysis/v1alpha1/metric_spec.go (review thread resolved, outdated)
apis/analysis/v1alpha1/metric_spec.go (review thread resolved, outdated)
apis/analysis/v1alpha1/condition.go (review thread resolved, outdated)
apis/analysis/v1alpha1/metric_spec.go (review thread resolved, outdated)
// PrometheusMetric defines the prometheus metric to be analyzed
type PrometheusMetric struct {
// Resource defines the key of resource to be analyzed
Resource v1.ResourceName `json:"name,omitempty"`
Member
I wonder if the name resource is suitable here. For example, is CPI a resource?

Contributor Author
Maybe it should just be defined as name here.

apis/analysis/v1alpha1/profiler.go (review thread resolved, outdated)
apis/analysis/v1alpha1/profiler.go (review thread resolved)
@zwzhang0107 (Contributor, Author) commented Feb 20, 2024

In the case where there is another layer in the usage scenario mentioned earlier, does MetricPrediction need to be a CRD?

@hormes The Recommendation controller in Koordinator does not need to create a MetricPrediction CR in the APIServer, which means MetricPrediction is an internal protocol in this scenario: the Recommendation CR is converted to an internal MetricPrediction for the framework.

A MetricPrediction CR will be created in the following scenarios:

  • An external controller wants to use the Prediction module of Koordinator; the MetricPrediction CRD then acts as an API between the external controller and Koordinator.
  • Before developing a new profiler controller, a MetricPrediction can be created to run experiments and demos ahead of the implementation, for example to compare whether the ARIMA or Prophet algorithm should be used in the NodeQoS controller.

First we will support the usage scenarios above, and the development will take two steps:

  • MetricPrediction framework with the Distribution model, using the resource prediction scenario to verify that the framework works well. The framework can then be extended with more algorithm models, such as Interference Detection.
  • Recommendation controller based on the MetricPrediction framework, considering workload type (Job/Service), OOM events, etc.

Signed-off-by: 佑祎 <zzw261520@alibaba-inc.com>
@koordinator-bot

New changes are detected. LGTM label has been removed.

@koordinator-bot removed the lgtm label Feb 20, 2024
// Source defines the source of metric, which can be metric server or prometheus
Source MetricSourceType `json:"source"`
// MetricServer defines the metric server source, which is effective when source is metric server
MetricServer *MetricServerSource `json:"metricServer,omitempty"`
Member
Suggested change:
- MetricServer *MetricServerSource `json:"metricServer,omitempty"`
+ MetricsAPI *MetricsAPIMetricSource `json:"metricsAPI,omitempty"`

@zwzhang0107 (Contributor, Author)
/hold until we have implemented the first user story

stale bot commented Jun 2, 2024

This issue has been automatically marked as stale because it has not had recent activity.
This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Close this issue or PR with /close

Thank you for your contributions.

@stale stale bot added the lifecycle/stale label Jun 2, 2024