[DBM] AWS RDS MySQL 8 replica active connections dominated by replica workers waiting for events #17358

Open
rahul342 opened this issue Apr 5, 2024 · 0 comments


Note: If you have a feature request, you should contact support so the request can be properly tracked.

Output of the info page
Cluster agent

 datadog-cluster-agent status
Getting the status from the agent.
2024-04-05 08:14:28 UTC | CLUSTER | WARN | (pkg/util/log/log.go:666 in func1) | Agent configuration relax permissions constraint on the secret backend cmd, Group can read and exec
2024-04-05 08:14:28 UTC | CLUSTER | INFO | (pkg/util/log/log.go:626 in func1) | 2 Features detected from environment: kubernetes,orchestratorexplorer

===============================
Datadog Cluster Agent (v7.50.3)
===============================

  Status date: 2024-04-05 08:14:28.144 UTC (1712304868144)
  Agent start: 2024-04-05 00:12:57.172 UTC (1712275977172)
  Pid: 1
  Go Version: go1.20.12
  Build arch: amd64
  Agent flavor: cluster_agent
  Check Runners: 4
  Log Level: INFO

  Paths
  =====
    Config File: /etc/datadog-agent/datadog-cluster.yaml
    conf.d: /etc/datadog-agent/conf.d

  Clocks
  ======
    System time: 2024-04-05 08:14:28.144 UTC (1712304868144)

  Hostnames
  =========
    ec2-hostname: ip-172-28-98-249.us-west-2.compute.internal
    host_aliases: [i-0fa10095f5a63f1d3]
    hostname: i-0fa10095f5a63f1d3
    instance-id: i-0fa10095f5a63f1d3
    socket-fqdn: datadog-helm-cluster-agent-6757d59b55-zc4c4
    socket-hostname: datadog-helm-cluster-agent-6757d59b55-zc4c4
    hostname provider: aws
    unused hostname providers:
      'hostname' configuration/environment: hostname is empty
      'hostname_file' configuration/environment: 'hostname_file' configuration is not enabled
      azure: azure_hostname_style is set to 'os'
      fargate: agent is not runnning on Fargate
      fqdn: FQDN hostname is not usable
      gce: unable to retrieve hostname from GCE: GCE metadata API error: status code 404 trying to GET http://169.254.169.254/computeMetadata/v1/instance/hostname
      os: OS hostname is not usable

  Metadata
  ========

Leader Election
===============
  Leader Election Status:  Running
  Leader Name is: datadog-helm-cluster-agent-6757d59b55-zrtbl
  Last Acquisition of the lease: Thu, 04 Apr 2024 19:31:38 UTC
  Renewed leadership: Fri, 05 Apr 2024 08:14:24 UTC
  Number of leader transitions: 376 transitions

Custom Metrics Server
=====================

  Data sources
  ------------
  URL: https://api.datadoghq.com


  External metrics provider uses DatadogMetric - Check status directly from Kubernetes with: `kubectl get datadogmetric`


Cluster Checks Dispatching
==========================
  Status: Follower, redirecting to leader at 172.28.178.253

Admission Controller
====================
  Disabled: The admission controller is not enabled on the Cluster Agent


=========
Collector
=========

  Running Checks
  ==============

    kubernetes_apiserver
    --------------------
      Instance ID: kubernetes_apiserver [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/kubernetes_apiserver.d/conf.yaml.default
      Total Runs: 1,926
      Metric Samples: Last Run: 0, Total: 0
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2024-04-05 08:14:13 UTC (1712304853000)
      Last Successful Execution Date : 2024-04-05 08:14:13 UTC (1712304853000)


    kubernetes_state_core
    ---------------------
      Instance ID: kubernetes_state_core:f0ece86b2bc4e82e [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/kubernetes_state_core.yaml.default
      Total Runs: 1,926
      Metric Samples: Last Run: 0, Total: 0
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2024-04-05 08:14:20 UTC (1712304860000)
      Last Successful Execution Date : 2024-04-05 08:14:20 UTC (1712304860000)


    orchestrator
    ------------
      Instance ID: orchestrator:c640d4e943da6c1d [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/orchestrator.d/conf.yaml.default
      Total Runs: 2,889
      Metric Samples: Last Run: 0, Total: 0
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2024-04-05 08:14:22 UTC (1712304862000)
      Last Successful Execution Date : 2024-04-05 08:14:22 UTC (1712304862000)

=========
Forwarder
=========

  Transactions
  ============
    Cluster: 0
    ClusterRole: 0
    ClusterRoleBinding: 0
    CronJob: 0
    CustomResource: 0
    CustomResourceDefinition: 0
    DaemonSet: 0
    Deployment: 0
    Dropped: 0
    HighPriorityQueueFull: 0
    HorizontalPodAutoscaler: 0
    Ingress: 0
    Job: 0
    Namespace: 0
    Node: 0
    OrchestratorManifest: 0
    PersistentVolume: 0
    PersistentVolumeClaim: 0
    Pod: 0
    ReplicaSet: 0
    Requeued: 0
    Retried: 0
    RetryQueueSize: 0
    Role: 0
    RoleBinding: 0
    Service: 0
    ServiceAccount: 0
    StatefulSet: 0
    VerticalPodAutoscaler: 0

  Transaction Successes
  =====================
    Total number: 3853
    Successes By Endpoint:
      check_run_v1: 1,926
      intake: 1
      series_v2: 1,926

  On-disk storage
  ===============
    On-disk storage is disabled. Configure `forwarder_storage_max_size_in_bytes` to enable it.

==========
Endpoints
==========
  https://app.datadoghq.com - API Key ending with:
      - 401b3


=============
Autodiscovery
=============
  Enabled Features
  ================
    kubernetes
    orchestratorexplorer

=====================
Orchestrator Explorer
=====================
  Collection Status: Clusterchecks are activated but still warming up, the collection could be running on CLC Runners. To verify that we need the clusterchecks to be warmed up.
  Cluster Name: eks-production-b
  Cluster ID: aa749ca9-4cbe-4505-a479-3292daae5092
  Container scrubbing: enabled
  Manifest collection: enabled

  ======================
  Orchestrator Endpoints
  ======================
    https://orchestrator.datadoghq.com - API Key ending with: 401b3

  Status: Follower, cluster agent leader is: datadog-helm-cluster-agent-6757d59b55-zrtbl

Node agent

agent status
2024-04-05 08:19:20 UTC | CORE | WARN | (pkg/util/log/log.go:666 in func1) | Deactivating Autoconfig will disable most components. It's recommended to use autoconfig_exclude_features and autoconfig_include_features to activate/deactivate features selectively
2024-04-05 08:19:20 UTC | CORE | WARN | (pkg/config/config.go:1602 in LoadCustom) | Unknown key in config file: runtime_security_config.network.enabled
2024-04-05 08:19:20 UTC | CORE | WARN | (pkg/config/config.go:1602 in LoadCustom) | Unknown key in config file: runtime_security_config.activity_dump.cgroup_wait_list_size
2024-04-05 08:19:20 UTC | CORE | WARN | (pkg/config/config.go:1602 in LoadCustom) | Unknown key in config file: runtime_security_config.activity_dump.path_merge.enabled
2024-04-05 08:19:20 UTC | CORE | WARN | (pkg/config/config.go:1602 in LoadCustom) | Unknown key in config file: runtime_security_config.syscall_monitor.enabled
2024-04-05 08:19:20 UTC | CORE | WARN | (cmd/system-probe/config/adjust.go:143 in deprecateCustom) | configuration key `runtime_security_config.activity_dump.cgroup_dump_timeout` is deprecated, use `runtime_security_config.activity_dump.dump_duration` instead
Getting the status from the agent.


===============
Agent (v7.50.3)
===============

  Status date: 2024-04-05 08:19:20.888 UTC (1712305160888)
  Agent start: 2024-04-02 16:20:55.25 UTC (1712074855250)
  Pid: 24724
  Go Version: go1.20.12
  Python Version: 3.9.18
  Build arch: amd64
  Agent flavor: agent
  Check Runners: 6
  Log Level: INFO

  Paths
  =====
    Config File: /etc/datadog-agent/datadog.yaml
    conf.d: /etc/datadog-agent/conf.d
    checks.d: /etc/datadog-agent/checks.d

  Clocks
  ======
    NTP offset: -20µs
    System time: 2024-04-05 08:19:20.888 UTC (1712305160888)

  Host Info
  =========
    bootTime: 2024-04-02 16:14:11 UTC (1712074451000)
    hostId: ec2370f1-d1cc-356b-6927-3d27197d54db
    kernelArch: x86_64
    kernelVersion: 5.4.226-129.415.amzn2.x86_64
    os: linux
    platform: amazon
    platformFamily: rhel
    platformVersion: 2
    procs: 847
    uptime: 6m46s

  Hostnames
  =========
    cluster-name: eks-production-b
    ec2-hostname: ip-172-28-104-77.us-west-2.compute.internal
    host_aliases: [i-02e5411f76cd87b64 ip-172-28-104-77.us-west-2.compute.internal-eks-production-b]
    hostname: i-02e5411f76cd87b64
    instance-id: i-02e5411f76cd87b64
    socket-fqdn: datadog-helm-2cwhb
    socket-hostname: datadog-helm-2cwhb
    host tags:
      cluster_name:eks-production-b
      env:production
      kube_cluster_name:eks-production-b
      kube_node:ip-172-28-104-77.us-west-2.compute.internal
    hostname provider: aws
    unused hostname providers:
      'hostname' configuration/environment: hostname is empty
      'hostname_file' configuration/environment: 'hostname_file' configuration is not enabled
      azure: azure_hostname_style is set to 'os'
      fargate: agent is not runnning on Fargate
      fqdn: FQDN hostname is not usable
      gce: unable to retrieve hostname from GCE: GCE metadata API error: status code 404 trying to GET http://169.254.169.254/computeMetadata/v1/instance/hostname
      os: OS hostname is not usable

  Metadata
  ========
    agent_version: 7.50.3
    config_apm_dd_url:
    config_dd_url:
    config_logs_dd_url:
    config_logs_socks5_proxy_address:
    config_no_proxy: [169.254.169.254 100.100.100.200]
    config_process_dd_url:
    config_proxy_http:
    config_proxy_https:
    config_site:
    feature_apm_enabled: false
    feature_container_images_enabled: true
    feature_csm_vm_containers_enabled: false
    feature_csm_vm_hosts_enabled: false
    feature_cspm_enabled: false
    feature_cws_enabled: false
    feature_cws_network_enabled: true
    feature_cws_remote_config_enabled: false
    feature_cws_security_profiles_enabled: false
    feature_dynamic_instrumentation_enabled: false
    feature_fips_enabled: false
    feature_imdsv2_enabled: false
    feature_logs_enabled: false
    feature_networks_enabled: true
    feature_networks_http_enabled: false
    feature_networks_https_enabled: false
    feature_oom_kill_enabled: false
    feature_otlp_enabled: false
    feature_process_enabled: false
    feature_process_language_detection_enabled: false
    feature_processes_container_enabled: true
    feature_remote_configuration_enabled: true
    feature_tcp_queue_length_enabled: false
    feature_usm_enabled: false
    feature_usm_go_tls_enabled: false
    feature_usm_http2_enabled: false
    feature_usm_http_by_status_code_enabled: false
    feature_usm_istio_enabled: false
    feature_usm_java_tls_enabled: false
    feature_usm_kafka_enabled: false
    feature_windows_crash_detection_enabled: false
    flavor: agent
    hostname_source: aws
    install_method_installer_version: datadog-3.53.0
    install_method_tool: helm
    install_method_tool_version: Helm
    system_probe_core_enabled: true
    system_probe_gateway_lookup_enabled: true
    system_probe_kernel_headers_download_enabled: false
    system_probe_max_connections_per_message: 600
    system_probe_prebuilt_fallback_enabled: true
    system_probe_protocol_classification_enabled: true
    system_probe_root_namespace_enabled: true
    system_probe_runtime_compilation_enabled: false
    system_probe_telemetry_enabled: true
    system_probe_track_tcp_4_connections: true
    system_probe_track_tcp_6_connections: true
    system_probe_track_udp_4_connections: true
    system_probe_track_udp_6_connections: true

=========
Collector
=========

  Running Checks
  ==============

    container
    ---------
      Instance ID: container [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/container.d/conf.yaml.default
      Total Runs: 15,354
      Metric Samples: Last Run: 1,960, Total: 31,325,764
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 42ms
      Last Execution Date : 2024-04-05 08:19:19 UTC (1712305159000)
      Last Successful Execution Date : 2024-04-05 08:19:19 UTC (1712305159000)


    containerd
    ----------
      Instance ID: containerd [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/containerd.d/conf.yaml.default
      Total Runs: 15,353
      Metric Samples: Last Run: 146, Total: 1,701,895
      Events: Last Run: 1, Total: 7,285
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 105ms
      Last Execution Date : 2024-04-05 08:19:11 UTC (1712305151000)
      Last Successful Execution Date : 2024-04-05 08:19:11 UTC (1712305151000)


    cpu
    ---
      Instance ID: cpu [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/cpu.d/conf.yaml.default
      Total Runs: 15,354
      Metric Samples: Last Run: 9, Total: 138,179
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2024-04-05 08:19:18 UTC (1712305158000)
      Last Successful Execution Date : 2024-04-05 08:19:18 UTC (1712305158000)


    cri
    ---
      Instance ID: cri [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/cri.d/conf.yaml.default
      Total Runs: 15,353
      Metric Samples: Last Run: 0, Total: 0
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 38ms
      Last Execution Date : 2024-04-05 08:19:10 UTC (1712305150000)
      Last Successful Execution Date : 2024-04-05 08:19:10 UTC (1712305150000)


    disk (5.0.0)
    ------------
      Instance ID: disk:67cc0574430a16ba [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/disk.d/conf.yaml.default
      Total Runs: 15,354
      Metric Samples: Last Run: 742, Total: 12,381,558
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 71ms
      Last Execution Date : 2024-04-05 08:19:17 UTC (1712305157000)
      Last Successful Execution Date : 2024-04-05 08:19:17 UTC (1712305157000)


    file_handle
    -----------
      Instance ID: file_handle [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/file_handle.d/conf.yaml.default
      Total Runs: 15,353
      Metric Samples: Last Run: 5, Total: 76,765
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2024-04-05 08:19:09 UTC (1712305149000)
      Last Successful Execution Date : 2024-04-05 08:19:09 UTC (1712305149000)


    io
    --
      Instance ID: io [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/io.d/conf.yaml.default
      Total Runs: 15,354
      Metric Samples: Last Run: 41, Total: 629,487
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2024-04-05 08:19:16 UTC (1712305156000)
      Last Successful Execution Date : 2024-04-05 08:19:16 UTC (1712305156000)


    istio (5.2.0)
    -------------
      Instance ID: istio:219e02e5b53b2260 [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/istio.d/auto_conf.yaml
      Total Runs: 8,742
      Metric Samples: Last Run: 1,166, Total: 10,193,172
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 255ms
      Last Execution Date : 2024-04-05 08:19:12 UTC (1712305152000)
      Last Successful Execution Date : 2024-04-05 08:19:12 UTC (1712305152000)

      Instance ID: istio:4744d8d41200c4a0 [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/istio.d/auto_conf.yaml
      Total Runs: 1,644
      Metric Samples: Last Run: 2,182, Total: 3,425,435
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 219ms
      Last Execution Date : 2024-04-05 08:19:15 UTC (1712305155000)
      Last Successful Execution Date : 2024-04-05 08:19:15 UTC (1712305155000)

      Instance ID: istio:70df63184cf82d2 [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/istio.d/auto_conf.yaml
      Total Runs: 8,288
      Metric Samples: Last Run: 1,798, Total: 14,901,204
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 345ms
      Last Execution Date : 2024-04-05 08:19:15 UTC (1712305155000)
      Last Successful Execution Date : 2024-04-05 08:19:15 UTC (1712305155000)

      Instance ID: istio:74a8e1af32ad8ebf [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/istio.d/auto_conf.yaml
      Total Runs: 1,648
      Metric Samples: Last Run: 2,566, Total: 3,987,890
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 257ms
      Last Execution Date : 2024-04-05 08:19:16 UTC (1712305156000)
      Last Successful Execution Date : 2024-04-05 08:19:16 UTC (1712305156000)

      Instance ID: istio:d2143dd4a37a4e7b [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/istio.d/auto_conf.yaml
      Total Runs: 364
      Metric Samples: Last Run: 1,542, Total: 552,476
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 311ms
      Last Execution Date : 2024-04-05 08:19:12 UTC (1712305152000)
      Last Successful Execution Date : 2024-04-05 08:19:12 UTC (1712305152000)

      Instance ID: istio:d28fd20277d333e9 [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/istio.d/auto_conf.yaml
      Total Runs: 1,656
      Metric Samples: Last Run: 2,574, Total: 3,982,556
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 259ms
      Last Execution Date : 2024-04-05 08:19:19 UTC (1712305159000)
      Last Successful Execution Date : 2024-04-05 08:19:19 UTC (1712305159000)

      Instance ID: istio:e17eac6a70aa287c [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/istio.d/auto_conf.yaml
      Total Runs: 1,652
      Metric Samples: Last Run: 266, Total: 439,197
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 29ms
      Last Execution Date : 2024-04-05 08:19:17 UTC (1712305157000)
      Last Successful Execution Date : 2024-04-05 08:19:17 UTC (1712305157000)

      Instance ID: istio:e8792afacd9d35f6 [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/istio.d/auto_conf.yaml
      Total Runs: 360
      Metric Samples: Last Run: 2,438, Total: 797,701
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 243ms
      Last Execution Date : 2024-04-05 08:19:11 UTC (1712305151000)
      Last Successful Execution Date : 2024-04-05 08:19:11 UTC (1712305151000)

      Instance ID: istio:f860a46d1e852beb [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/istio.d/auto_conf.yaml
      Total Runs: 1,640
      Metric Samples: Last Run: 1,542, Total: 2,525,445
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 307ms
      Last Execution Date : 2024-04-05 08:19:14 UTC (1712305154000)
      Last Successful Execution Date : 2024-04-05 08:19:14 UTC (1712305154000)


    kube_proxy (6.1.1)
    ------------------
      Instance ID: kube_proxy:18f060562e1fbff3 [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/kube_proxy.yaml
      Total Runs: 15,354
      Metric Samples: Last Run: 73, Total: 1,114,329
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 2, Total: 30,708
      Average Execution Time : 51ms
      Last Execution Date : 2024-04-05 08:19:12 UTC (1712305152000)
      Last Successful Execution Date : 2024-04-05 08:19:12 UTC (1712305152000)


    kubelet (7.10.1)
    ----------------
      Instance ID: kubelet:2b9bec749170d31d [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/kubelet.d/conf.yaml.default
      Total Runs: 11,438
      Metric Samples: Last Run: 2,067, Total: 24,025,674
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 5, Total: 57,056
      Average Execution Time : 540ms
      Last Execution Date : 2024-04-05 08:19:18 UTC (1712305158000)
      Last Successful Execution Date : 2024-04-05 08:19:18 UTC (1712305158000)


    load
    ----
      Instance ID: load [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/load.d/conf.yaml.default
      Total Runs: 15,353
      Metric Samples: Last Run: 6, Total: 92,118
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2024-04-05 08:19:08 UTC (1712305148000)
      Last Successful Execution Date : 2024-04-05 08:19:08 UTC (1712305148000)


    memory
    ------
      Instance ID: memory [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/memory.d/conf.yaml.default
      Total Runs: 15,354
      Metric Samples: Last Run: 20, Total: 307,080
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2024-04-05 08:19:15 UTC (1712305155000)
      Last Successful Execution Date : 2024-04-05 08:19:15 UTC (1712305155000)


    network (3.0.0)
    ---------------
      Instance ID: network:4b0649b7e11f0772 [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/network.d/conf.yaml.default
      Total Runs: 15,353
      Metric Samples: Last Run: 270, Total: 4,254,102
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 4ms
      Last Execution Date : 2024-04-05 08:19:07 UTC (1712305147000)
      Last Successful Execution Date : 2024-04-05 08:19:07 UTC (1712305147000)


    ntp
    ---
      Instance ID: ntp:3c427a42a70bbf8 [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/ntp.d/conf.yaml.default
      Total Runs: 256
      Metric Samples: Last Run: 1, Total: 256
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 1, Total: 256
      Average Execution Time : 0s
      Last Execution Date : 2024-04-05 08:05:57 UTC (1712304357000)
      Last Successful Execution Date : 2024-04-05 08:05:57 UTC (1712304357000)


    orchestrator_pod
    ----------------
      Instance ID: orchestrator_pod:888ebc42a3817b00 [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/orchestrator_pod.d/conf.yaml.default
      Total Runs: 15,354
      Metric Samples: Last Run: 0, Total: 0
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2024-04-05 08:19:14 UTC (1712305154000)
      Last Successful Execution Date : 2024-04-05 08:19:14 UTC (1712305154000)


    uptime
    ------
      Instance ID: uptime [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/uptime.d/conf.yaml.default
      Total Runs: 15,353
      Metric Samples: Last Run: 1, Total: 15,353
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2024-04-05 08:19:06 UTC (1712305146000)
      Last Successful Execution Date : 2024-04-05 08:19:06 UTC (1712305146000)

========
JMXFetch
========

  Information
  ==================
  Initialized checks
  ==================
    no checks

  Failed checks
  =============
    no checks

=========
Forwarder
=========

  Transactions
  ============
    Cluster: 0
    ClusterRole: 0
    ClusterRoleBinding: 0
    CronJob: 0
    CustomResource: 0
    CustomResourceDefinition: 0
    DaemonSet: 0
    Deployment: 0
    Dropped: 529
    HighPriorityQueueFull: 16
    HorizontalPodAutoscaler: 0
    Ingress: 0
    Job: 0
    Namespace: 0
    Node: 0
    OrchestratorManifest: 0
    PersistentVolume: 0
    PersistentVolumeClaim: 0
    Pod: 0
    ReplicaSet: 0
    Requeued: 21,934
    Retried: 931
    RetryQueueSize: 0
    Role: 0
    RoleBinding: 0
    Service: 0
    ServiceAccount: 0
    StatefulSet: 0
    VerticalPodAutoscaler: 0

  Transaction Successes
  =====================
    Total number: 109483
    Successes By Endpoint:
      check_run_v1: 15,272
      intake: 2,057
      metadata_v1: 830
      series_v2: 76,052
      sketches_v2: 15,272

  Transaction Errors
  ==================
    Total number: 96
    Errors By Type:
      ConnectionErrors: 1,504

  On-disk storage
  ===============
    On-disk storage is disabled. Configure `forwarder_storage_max_size_in_bytes` to enable it.

  API Keys status
  ===============
    API key ending with 401b3: API Key valid

==========
Endpoints
==========
  https://app.datadoghq.com - API Key ending with:
      - 401b3

==========
Logs Agent
==========

  Logs Agent is not running



============
System Probe
============
  Status: Running
  Uptime: 63h58m15.001056932s
  Last Updated: 2024-04-05 08:19:13 UTC (1712305153000)

  USM
  ===
    Status: <no value>

  NPM
  ===
    Status: Running
    Last Check: 2024-04-05 08:18:59 UTC (1712305139000)


=============
Process Agent
=============

  Version: 7.50.3
  Status date: 2024-04-05 08:19:20.892 UTC (1712305160892)
  Process Agent Start: 2024-04-02 16:20:55.453 UTC (1712074855453)
  Pid: 25154
  Go Version: go1.20.12
  Build arch: amd64
  Log Level: INFO
  Enabled Checks: [process_discovery pod connections rtcontainer container]
  Allocated Memory: 27,559,312 bytes
  Hostname: i-02e5411f76cd87b64
  System Probe Process Module Status: Not running
  Process Language Detection Enabled: False

  =================
  Process Endpoints
  =================
    https://process.datadoghq.com - API Key ending with:
        - 401b3

  =========
  Collector
  =========
    Last collection time: 2024-04-05 08:19:19
    Docker socket:
    Number of processes: 0
    Number of containers: 59
    Process Queue length: 0
    RTProcess Queue length: 0
    Connections Queue length: 0
    Event Queue length: 0
    Pod Queue length: 0
    Process Bytes enqueued: 0
    RTProcess Bytes enqueued: 0
    Connections Bytes enqueued: 0
    Event Bytes enqueued: 0
    Pod Bytes enqueued: 0
    Drop Check Payloads: []

  ==========
  Extractors
  ==========

    Workloadmeta
    ============
      Cache size: 0
      Stale diffs discarded: 0
      Diffs dropped: 0

=========
APM Agent
=========
  Status: Running
  Pid: 25035
  Uptime: 230305 seconds
  Mem alloc: 26,282,472 bytes
  Hostname: i-02e5411f76cd87b64
  Receiver: 0.0.0.0:8126
  Endpoints:
    https://trace.agent.datadoghq.com

  Receiver (previous minute)
  ==========================
    From go 1.20.3 (gc-amd64-linux), client v1.50.0
      Traces received: 38328 (37,911,802 bytes)
      Spans received: 83140

    From cpp 201402 (), client v1.2.1
      Traces received: 40430 (44,681,902 bytes)
      Spans received: 43621

    From python 3.9.18 (CPython), client 0.29.0
      Traces received: 17 (49,266 bytes)
      Spans received: 221

    From ruby 3.0.5 (ruby-x86_64-linux), client 1.13.1
      Traces received: 28009 (593,635,817 bytes)
      Spans received: 1.003521e+06


    Priority sampling rate for 'service:collector.newrelic.com,env:production': 19.3%
    Priority sampling rate for 'service:event-publisher,env:production': 0.0%
    Priority sampling rate for 'service:grpc-internal,env:production': 57.9%
    Priority sampling rate for 'service:grpc-internal.production,env:production': 57.9%
    Priority sampling rate for 'service:mysql2,env:production': 100.0%
    Priority sampling rate for 'service:orchard-web,env:production': 100.0%
    Priority sampling rate for 'service:redis,env:production': 100.0%
    Priority sampling rate for 'service:sentry.io,env:production': 38.6%
    Priority sampling rate for 'service:sidekiq,env:production': 1.3%
    Priority sampling rate for 'service:sidekiq-iot,env:production': 0.8%
    Priority sampling rate for 'service:sidekiq-makara_mysql2rgeo,env:production': 77.2%
    Priority sampling rate for 'service:sidekiq-rider,env:production': 0.2%
    Priority sampling rate for 'service:sidekiq-streaming,env:production': 0.6%
    Priority sampling rate for 'service:sidekiq-supply,env:production': 1.6%
    Priority sampling rate for 'service:statsigapi.net,env:production': 9.6%
    Priority sampling rate for 'service:tag-server-go,env:production': 0.1%
    Priority sampling rate for 'service:tag-server-go.production,env:production': 0.3%
    Priority sampling rate for 'service:web-external-api,env:production': 77.2%
    Priority sampling rate for 'service:web-external-api.production,env:production': 77.2%
    Priority sampling rate for 'service:web-message,env:production': 57.9%
    Priority sampling rate for 'service:web-message.production,env:production': 57.9%
    Priority sampling rate for 'service:web-rider,env:production': 100.0%
    Priority sampling rate for 'service:web-rider.production,env:production': 46.3%

  Writer (previous minute)
  ========================
    Traces: 0 payloads, 0 traces, 0 events, 0 bytes
    Stats: 0 payloads, 0 stats buckets, 0 bytes

==========
Aggregator
==========
  Checks Metric Sample: 393,955,561
  Dogstatsd Metric Sample: 1,652,553,399
  Event: 7,287
  Events Flushed: 7,287
  Number Of Flushes: 15,353
  Series Flushed: 458,465,860
  Service Check: 103,395
  Service Checks Flushed: 118,740
  Sketches Flushed: 9,250,184
  container-images: 3,181
  container-lifecycle: 658

=========
DogStatsD
=========
  Event Packets: 1
  Event Parse Errors: 0
  Metric Packets: 1,652,553,398
  Metric Parse Errors: 0
  Service Check Packets: 0
  Service Check Parse Errors: 0
  Udp Bytes: 28,853,685,311
  Udp Packet Reading Errors: 0
  Udp Packets: 49,667,867
  Uds Bytes: 193,026,699,655
  Uds Origin Detection Errors: 0
  Uds Packet Reading Errors: 0
  Uds Packets: 1,258,203,539
  Unterminated Metric Errors: 0

Tip: For troubleshooting, enable 'dogstatsd_metrics_stats_enable' in the main datadog.yaml file to generate Dogstatsd logs. Once 'dogstatsd_metrics_stats_enable' is enabled, users can also use 'dogstatsd-stats' command to get visibility of the latest collected metrics.

=====================
Datadog Cluster Agent
=====================

  - Datadog Cluster Agent endpoint detected: https://10.100.25.167:5005
  Successfully connected to the Datadog Cluster Agent.
  - Running: 7.50.3+commit.abce0cb

=============
Autodiscovery
=============
  Enabled Features
  ================
    containerd
    cri
    kubernetes
    orchestratorexplorer

====================
Remote Configuration
====================


    Organization enabled: False
    API Key: Not authorized, add the Remote Configuration Read permission to enable it for this agent.
    Last error: None



====
OTLP
====

  Status: Not enabled
  Collector status: Not running

Additional environment details (Operating System, Cloud provider, etc):
EKS on AWS

Steps to reproduce the issue:

  1. Enable DBM on an RDS MySQL 8 setup with a primary and a few replicas.

Describe the results you received:
The replica's active connections graph is flooded with BEGIN statements waiting on wait/synch/cond/sql/Worker_info::jobs_cond.
[image: DBM active connections graph for the replica]

Looking at the query in mysql/datadog_checks/mysql/activity.py, these threads do seem to have the command Query, but the thread itself appears to be sleeping while it waits for work: https://github.com/mysql/mysql-server/blob/mysql-cluster-8.0.36/sql/rpl_rli_pdb.cc#L2345C6-L2345C22

RDS Performance Insights for the same DB looks like this:
[image: RDS Performance Insights view of the same DB]

They seem to filter these threads out correctly.

Describe the results you expected:
I believe DBM should filter such events out of the active connections graph.
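
To make the request concrete, here is a rough sketch of the kind of filtering I have in mind. It is not the actual code from activity.py: the row shape, the "wait_event" key, and the function name drop_idle_replica_workers are all assumptions for illustration. The only value taken from this report is the wait event name.

```python
from typing import Dict, Iterable, List

# Wait event seen on the idle replica worker threads in this report.
IDLE_REPLICA_WAIT_EVENTS = frozenset({
    "wait/synch/cond/sql/Worker_info::jobs_cond",
})


def drop_idle_replica_workers(rows: Iterable[Dict]) -> List[Dict]:
    """Keep only rows that are not replica workers waiting for a job.

    Each row is a hypothetical dict with a "wait_event" key; the real
    column names returned by the activity query may differ.
    """
    return [
        row for row in rows
        if row.get("wait_event") not in IDLE_REPLICA_WAIT_EVENTS
    ]


# Hypothetical sample rows: one idle replica worker, one real query.
sample = [
    {"thread_id": 1, "sql_text": "BEGIN",
     "wait_event": "wait/synch/cond/sql/Worker_info::jobs_cond"},
    {"thread_id": 2, "sql_text": "SELECT 1",
     "wait_event": "wait/io/table/sql/handler"},
]
active = drop_idle_replica_workers(sample)
print([row["thread_id"] for row in active])
```

The same condition could probably be pushed into the SQL query itself instead of filtering in Python, but I have not checked which is the better fit for the check.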

Additional information you deem important (e.g. issue happens only occasionally):

rahul342 changed the title from "AWS RDS MySQL 8 replica active connections dominated by replica workers waiting for events" to "[DBM] AWS RDS MySQL 8 replica active connections dominated by replica workers waiting for events" on Apr 5, 2024