Monitoring ElastiCache redis is broken #571

Open
michaelwittig opened this issue Aug 6, 2021 · 0 comments

Monitoring ElastiCache Redis with CloudWatch alarms is currently not possible in an elegant way in CloudFormation. The CacheClusterId CloudWatch metric dimension is constructed in a way that prevents us from creating alarms for the relevant metrics: its value differs per node and depends on the number of shards and replicas.

  • For cluster mode disabled replication groups (NumShards = 1), the following CacheClusterIds are used: ${ReplicationGroup}-NNN (NNN e.g., 001, 002, ..., 006 for each replica)
  • For cluster mode enabled replication groups (NumShards > 1), the following CacheClusterIds are used: ${ReplicationGroup}-MMMM-NNN (MMMM e.g., 0001, 0002, ... for each node group/shard ID)
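
The naming scheme above can be sketched in a few lines of Python. This is only an illustration of the pattern described in the list: the function name `cache_cluster_ids` and its parameters are hypothetical, and it assumes each shard has one primary plus the given number of replicas, with zero-padded counters starting at 1.

```python
from __future__ import annotations


def cache_cluster_ids(replication_group: str, num_shards: int, replicas_per_shard: int) -> list[str]:
    """Enumerate the CacheClusterId values for a replication group.

    Hypothetical helper: follows the naming pattern described in this issue.
    """
    nodes_per_shard = replicas_per_shard + 1  # assumption: one primary plus its replicas
    node_numbers = range(1, nodes_per_shard + 1)
    if num_shards == 1:
        # Cluster mode disabled: <group>-NNN
        return [f"{replication_group}-{n:03d}" for n in node_numbers]
    # Cluster mode enabled: <group>-MMMM-NNN
    return [
        f"{replication_group}-{shard:04d}-{n:03d}"
        for shard in range(1, num_shards + 1)
        for n in node_numbers
    ]


if __name__ == "__main__":
    print(cache_cluster_ids("mygroup", 1, 2))
    print(cache_cluster_ids("mygroup", 2, 1))
```

Every ID in the returned list would need its own alarm, because each one is a distinct value of the CacheClusterId dimension.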

An alarm like the following would solve the issue. Unfortunately, CloudWatch alarms do not support search expressions yet:

CPUUtilizationTooHighAlarm:
  Type: 'AWS::CloudWatch::Alarm'
  Properties:
    AlarmDescription: !Sub 'Average CPU utilization over last 10 minutes higher than ${CPUUtilizationThreshold}%'
    ComparisonOperator: GreaterThanThreshold
    EvaluationPeriods: 1
    Metrics:
    - Expression: !Sub 'SEARCH(''{AWS/ElastiCache, CacheClusterId} ${ReplicationGroup} "CPUUtilization"'', ''Average'', 600)'
      Id: 'e1'
      Label: 'e1'
      ReturnData: false
    - Expression: 'MAX(e1)'
      Id: 'e2'
      Label: 'e2'
      ReturnData: true
    Threshold: !Ref CPUUtilizationThreshold
    AlarmActions:
    - 'Fn::ImportValue': !Sub '${ParentAlertStack}-TopicARN'

We don't have loops in CloudFormation either, so we cannot generate one alarm per node.

Since we allow up to 250 shards with up to 5 replicas each, we would need too many conditions, and the template would bloat massively.
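
To get a feel for the scale, here is a rough Python sketch that builds one per-node CPUUtilization alarm for the largest configuration. It is not something the template does today: the function `per_node_alarms`, the logical ID scheme, and the idea of generating resources outside CloudFormation are all hypothetical, and it assumes each shard consists of one primary plus its replicas.

```python
from __future__ import annotations


def per_node_alarms(num_shards: int, replicas_per_shard: int) -> dict[str, dict]:
    """Build one alarm resource per node of a cluster mode enabled group (hypothetical)."""
    resources: dict[str, dict] = {}
    for shard in range(1, num_shards + 1):
        # Assumption: node 001 is the primary, followed by the replicas.
        for node in range(1, replicas_per_shard + 2):
            logical_id = f"CPUUtilizationTooHighAlarm{shard:04d}{node:03d}"
            resources[logical_id] = {
                "Type": "AWS::CloudWatch::Alarm",
                "Properties": {
                    "Namespace": "AWS/ElastiCache",
                    "MetricName": "CPUUtilization",
                    "Dimensions": [{
                        "Name": "CacheClusterId",
                        # Renders as ${ReplicationGroup}-MMMM-NNN for Fn::Sub
                        "Value": {"Fn::Sub": f"${{ReplicationGroup}}-{shard:04d}-{node:03d}"},
                    }],
                    "Statistic": "Average",
                    "Period": 600,
                    "EvaluationPeriods": 1,
                    "ComparisonOperator": "GreaterThanThreshold",
                    "Threshold": {"Ref": "CPUUtilizationThreshold"},
                    "AlarmActions": [
                        {"Fn::ImportValue": {"Fn::Sub": "${ParentAlertStack}-TopicARN"}}
                    ],
                },
            }
    return resources


if __name__ == "__main__":
    # 250 shards x (1 primary + 5 replicas)
    print(len(per_node_alarms(250, 5)))  # 1500
```

Under these assumptions, the maximum configuration needs 1,500 alarm resources for a single metric, and in a static template each of them would also need its own condition.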

Not sure what we can do here...

@michaelwittig michaelwittig self-assigned this Aug 6, 2021