Can't query datadog logs using analysis template #2628

Open
joey100 opened this issue Feb 27, 2023 · 4 comments · May be fixed by #3560
Labels
analysis Related to Analysis CRD bug Something isn't working no-issue-activity

Comments

joey100 commented Feb 27, 2023

Checklist:

  • [ YES] I've included steps to reproduce the bug.
  • [ YES] I've included the version of argo rollouts.

Describe the bug

When we use an AnalysisTemplate to query Datadog logs, the AnalysisRun runs, but every measurement value is '[]', even though the same query returns results in Datadog.
When we use an AnalysisTemplate to query Datadog metrics, it works fine: the AnalysisRun runs and returns the expected values.

To Reproduce

  1. Define an analysistemplate like below:
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: experimental-test-error-count
spec:
  args:
  - name: service-name
  - name: env
  - name: version
  metrics:
  - name: latency
    interval: 30s 
    successCondition: default(result, 0) <= 100
    failureLimit: 1 
    provider:
      datadog:
        interval: 5m
        query: |
          logs("service:experimental-bff status:error").index("*").rollup("count")
  2. Run the rollout with the template.
  3. The analysisrun result is below:
spec:
  metrics:
  - failureLimit: 1
    interval: 30s
    name: latency
    provider:
      datadog:
        interval: 5m
        query: |
          logs("service:experimental-bff status:error").index("*").rollup("count")
    successCondition: default(result, 0) <= 100
  terminate: true
status:
  dryRunSummary: {}
  message: Run Terminated
  metricResults:
  - count: 6
    measurements:
    - finishedAt: "2023-02-27T10:33:04Z"
      phase: Successful
      startedAt: "2023-02-27T10:33:03Z"
      value: '[]'
    - finishedAt: "2023-02-27T10:33:34Z"
      phase: Successful
      startedAt: "2023-02-27T10:33:34Z"
      value: '[]'
    - finishedAt: "2023-02-27T10:34:04Z"
      phase: Successful
      startedAt: "2023-02-27T10:34:04Z"
      value: '[]'
    - finishedAt: "2023-02-27T10:34:34Z"
      phase: Successful
      startedAt: "2023-02-27T10:34:34Z"
      value: '[]'
    - finishedAt: "2023-02-27T10:35:04Z"
      phase: Successful
      startedAt: "2023-02-27T10:35:04Z"
      value: '[]'
    - finishedAt: "2023-02-27T10:35:34Z"
      phase: Successful
      startedAt: "2023-02-27T10:35:34Z"
      value: '[]'
    name: latency
    phase: Successful
    successful: 6
  phase: Successful
  runSummary:
    count: 1
    successful: 1
  startedAt: "2023-02-27T10:33:04Z"

Expected behavior

The AnalysisRun should query Datadog logs successfully and return the actual log count rather than an empty (nil) result.

Screenshots

analysisrun-result
datadog-result

Version

1.4

Logs

# Paste the logs from the rollout controller

# Logs for the entire controller:
kubectl logs -n argo-rollouts deployment/argo-rollouts

# Logs for a specific rollout:
kubectl logs -n argo-rollouts deployment/argo-rollouts | grep rollout=<ROLLOUTNAME>

Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.

@joey100 joey100 added the bug Something isn't working label Feb 27, 2023
@alonbehaim

@joey100 it seems that only metrics queries are currently supported, as you can see here.
Anyway, I'd recommend moving to the Datadog API v2, which is based on api/v2/query/timeseries.

You can also create a logs pipeline in Datadog to generate metrics and use those. Maybe supporting logs should be a feature request.
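
For readers looking for a concrete shape of the logs-to-metrics workaround, here is a minimal sketch. It assumes a log-based metric has already been created in Datadog from the filter `service:experimental-bff status:error`; the metric name `experimental_bff.error.count` and the template name are hypothetical placeholders, not something taken from this thread. The template then queries that metric with an ordinary metrics query instead of `logs(...)`.

```yaml
# Sketch only. Assumes a log-based metric named
# "experimental_bff.error.count" (hypothetical name) exists in Datadog,
# generated from logs matching "service:experimental-bff status:error".
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: experimental-test-error-count-from-metric  # hypothetical name
spec:
  metrics:
  - name: error-count
    interval: 30s
    # default(result, 0) treats an empty result as 0, as in the original template
    successCondition: default(result, 0) <= 100
    failureLimit: 1
    provider:
      datadog:
        interval: 5m
        query: |
          sum:experimental_bff.error.count{*}.as_count()
```

Only tags that were added as dimensions when generating the metric can be used in the `{...}` scope, which is why the sketch uses `{*}`.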

joey100 commented Mar 14, 2023

> @joey100 it seems that only metrics queries are currently supported, as you can see here.
> Anyway, I'd recommend moving to the Datadog API v2, which is based on api/v2/query/timeseries.
>
> You can also create a logs pipeline in Datadog to generate metrics and use those. Maybe supporting logs should be a feature request.

Got it, thanks.

deadlysyn commented Apr 14, 2023

we found the same issue where metrics is hard coded. still thinking of creative workarounds (thanks for the pipeline idea!), but unless there is a technical reason to avoid it, supporting logs, apm/spans, etc. as query types would be a useful feature.

trying to use apiVersion: v2 in a ClusterAnalysisTemplate with argo-rollouts:latest gives error despite seeming to match docs and code clearly supporting it. 🤔

image

apiVersion: argoproj.io/v1alpha1
kind: ClusterAnalysisTemplate
metadata:
  name: error-rate
spec:
  args:
  - name: dd-service-name
  metrics:
  - name: error-rate
    interval: 1m
    successCondition: default(result, 0) < 1
    failureLimit: 1
    provider:
      datadog:
        apiVersion: v2
        interval: 1m
        query: |
          avg(last_1h):anomalies(sum:trace.graphql.execute.errors{cluster_name:foobah-eks-2022-09,env:production,service:{{args.dd-service-name}}}, 'robust', 4, direction='above', alert_window='last_5m', interval=1, count_default_zero='true')

just removing apiVersion works fine. what am i missing?

apiVersion: v2 is only supported on 1.5.0-rc1.

@kostis-codefresh kostis-codefresh added the analysis Related to Analysis CRD label Jun 7, 2023

github-actions bot commented Aug 7, 2023

This issue is stale because it has been open 60 days with no activity.
