AWS CloudWatch allows you to raise alarms when a metric is above or below a given threshold. But what if you want the alarm only when the metric is between two thresholds? That’s where metric math comes in.

A Simple Alarm

In CloudFormation, you can define an alarm quite easily:

Type: AWS::CloudWatch::Alarm
Properties:
  ActionsEnabled: true
  AlarmActions:
    - arn:aws:sns:eu-west-1:123456789:someSnsTopic
  AlarmDescription: "Some description"
  ComparisonOperator: GreaterThanOrEqualToThreshold
  DatapointsToAlarm: 1
  Dimensions:
    - Name: ApiName
      Value: "myApi"
  EvaluationPeriods: 1
  MetricName: Count
  Namespace: AWS/ApiGateway
  Period: 60
  Statistic: Sum
  Threshold: 5000
  TreatMissingData: notBreaching

This will raise an alarm when the number of requests per minute to the given API reaches or exceeds 5000.

What if you want this alarm, but a different alarm when the count goes over 10000? If you define the two alarms just like this, both alarms will go off when you have 10000 or more requests per minute, because 10000 is also more than 5000.

Metric Math

The solution is to use metric math. With it, you define an alarm on a list of metrics instead of a single metric:

Type: AWS::CloudWatch::Alarm
Properties:
  ActionsEnabled: true
  AlarmActions:
    - arn:aws:sns:eu-west-1:123456789:someSnsTopic
  AlarmDescription: "Some description"
  ComparisonOperator: GreaterThanThreshold
  Threshold: 0
  DatapointsToAlarm: 1
  EvaluationPeriods: 1
  Metrics:
    - Id: "requests_over_5000"
      Label: "Requests per minute between 5000 and 10000"
      Expression: "IF(m1 >= 5000 AND m1 < 10000, 1, 0)" # reference the id of the metric below
      ReturnData: true # true for the one metric that you'll use in your alarm
    - Id: "m1"
      Label: "Invocation Count"
      MetricStat:
        Metric:
          Namespace: AWS/ApiGateway
          MetricName: Count
          Dimensions:
            - Name: ApiName
              Value: "myApi"
        Period: 60
        Stat: Sum
      ReturnData: false # false for any 'supporting' metric
  TreatMissingData: notBreaching

We define a metric for the request count just like in a normal alarm, but we put it in the Metrics list of the alarm. We also set ReturnData to false and give it an Id (only letters, numbers and underscores, and it must start with a lowercase letter).

Then we add another metric to the list. This time we set ReturnData to true because this is the metric that will be used to evaluate the alarm. Instead of giving it a MetricStat, we set an Expression.

If the count is at least 5000 and below 10000, we return 1 for this metric (requests_over_5000); otherwise we return 0. The expression can reference the ids of other metrics in our list. In our example, this is m1.

Back to the alarm itself. We define that the alarm should go off when the value is greater than 0. Based on our metric (requests_over_5000), the value will be either 0 (we’re under 5000, or at 10000 or above) or 1 (we’re between the two values). So it acts as a sort of boolean: if it’s 1, the alarm goes off, and if it’s 0, it doesn’t. For example, a count of 4999 gives 0, counts of 5000 and 9999 give 1, and a count of 10000 gives 0.
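
The two thresholds are hard-coded in the expression above. As a minimal sketch, assuming you want to reuse the template with different limits, they could be passed in as CloudFormation parameters and substituted into the expression with !Sub. The parameter names LowerThreshold and UpperThreshold and the logical ID RequestsInBandAlarm are hypothetical and not part of the original example; alarm actions and descriptions are left out for brevity.

```yaml
Parameters:
  LowerThreshold: # hypothetical parameter name
    Type: Number
    Default: 5000
  UpperThreshold: # hypothetical parameter name
    Type: Number
    Default: 10000

Resources:
  RequestsInBandAlarm: # hypothetical logical ID
    Type: AWS::CloudWatch::Alarm
    Properties:
      ComparisonOperator: GreaterThanThreshold
      Threshold: 0
      DatapointsToAlarm: 1
      EvaluationPeriods: 1
      Metrics:
        - Id: "requests_over_5000"
          # !Sub fills in the parameter values before the expression reaches CloudWatch
          Expression: !Sub "IF(m1 >= ${LowerThreshold} AND m1 < ${UpperThreshold}, 1, 0)"
          ReturnData: true
        - Id: "m1"
          MetricStat:
            Metric:
              Namespace: AWS/ApiGateway
              MetricName: Count
              Dimensions:
                - Name: ApiName
                  Value: "myApi"
            Period: 60
            Stat: Sum
          ReturnData: false
      TreatMissingData: notBreaching
```

With the defaults, this should behave the same as the hard-coded version: the expression sent to CloudWatch is still IF(m1 >= 5000 AND m1 < 10000, 1, 0).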

As a last step, we can add a regular alarm like in our first example, with a threshold of 10000, to trigger when we have 10000 or more requests per minute.
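
As a sketch, the alarm for the upper range could look like the following. It is the first example with only the threshold and description changed. The description text is made up for illustration, and reusing the same placeholder SNS topic is an assumption; in practice you would probably point this alarm at a different topic so the two alarms can be told apart.

```yaml
Type: AWS::CloudWatch::Alarm
Properties:
  ActionsEnabled: true
  AlarmActions:
    - arn:aws:sns:eu-west-1:123456789:someSnsTopic # placeholder; likely a different topic in practice
  AlarmDescription: "10000 or more requests per minute" # illustrative description
  ComparisonOperator: GreaterThanOrEqualToThreshold
  DatapointsToAlarm: 1
  Dimensions:
    - Name: ApiName
      Value: "myApi"
  EvaluationPeriods: 1
  MetricName: Count
  Namespace: AWS/ApiGateway
  Period: 60
  Statistic: Sum
  Threshold: 10000 # picks up exactly where the metric math alarm stops (m1 < 10000)
  TreatMissingData: notBreaching
```

Because the metric math expression uses m1 < 10000 and this alarm uses GreaterThanOrEqualToThreshold, a count of exactly 10000 triggers only this alarm, so the two ranges do not overlap.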
