-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(cloudwatch): support the creation of CloudWatch anomaly detection alarm #31232
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pull request linter has failed. See the aws-cdk-automation comment below for failure reasons. If you believe this pull request should receive an exemption, please comment and provide a justification.
A comment requesting an exemption should contain the text Exemption Request
. Additionally, if clarification is needed add Clarification Request
to a comment.
✅ Updated pull request passes all PRLinter validations. Dismissing previous PRLinter review.
if (isAnomalyDetection) { | ||
// Ensure returnData is set to true for both metrics in anomaly detection | ||
const cfnMetrics = alarm.metrics as any[]; | ||
if (cfnMetrics && cfnMetrics.length >= 2) { | ||
cfnMetrics[0].returnData = true; | ||
cfnMetrics[1].returnData = true; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is cfnMetrics
used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only in this block. To make sure the returnData
property is set to true. TBH, the documentation is not super clear about which one should have the property set to true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you maybe provide the documentation in the comment? This code seems weird and providing the doc in the comment for future reference would help.
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
To create an alarm based on anomaly detection, you can use the `createAnomalyDetectionAlarm` method on a `Metric` or `MathExpression` object: | ||
|
||
```ts | ||
declare const queue: sqs.Queue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: customers may not get why using a queue here. Can we use the example in the integration test?
const metric = new Metric({
namespace: 'AWS/EC2',
metricName: 'CPUUtilization',
statistic: 'Average',
period: Duration.minutes(5),
});
metric.createAnomalyDetectionAlarm(this, 'AnomalyAlarm', {
bounds: 2,
evaluationPeriods: 3,
datapointsToAlarm: 2,
comparisonOperator: ComparisonOperator.LESS_THAN_LOWER_OR_GREATER_THAN_UPPER_THRESHOLD,
});
throw new Error(`Invalid comparison operator for anomaly detection alarm: ${props.comparisonOperator}`); | ||
} | ||
const anomalyDetectionExpression = new MathExpression({ | ||
expression: `ANOMALY_DETECTION_BAND(m0, ${props.bounds ?? 1})`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since bounds
is a required property, do we need null check here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, this is useless now that the property is required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or should we have a default value for it?
@@ -190,7 +213,8 @@ export class Alarm extends AlarmBase { | |||
|
|||
// Evaluation | |||
comparisonOperator, | |||
threshold: props.threshold, | |||
threshold: isAnomalyDetection ? undefined : props.threshold, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will happen if it's anomaly detection and threshold value is passed in? Does it ignore the value or throw an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the comparison operator is for an anomaly detection alarm, there will be an error when CloudFormation will try to execute the template. Because providing a threshold for anomaly detection is not valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do this validation on CDK side to preemptively throw?
General question: There's a |
@@ -190,7 +213,8 @@ export class Alarm extends AlarmBase { | |||
|
|||
// Evaluation | |||
comparisonOperator, | |||
threshold: props.threshold, | |||
threshold: isAnomalyDetection ? undefined : props.threshold, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do this validation on CDK side to preemptively throw?
@@ -206,12 +230,22 @@ export class Alarm extends AlarmBase { | |||
...metricProps, | |||
}); | |||
|
|||
if (isAnomalyDetection) { | |||
// Ensure returnData is set to true for both metrics in anomaly detection | |||
const cfnMetrics = alarm.metrics as any[]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we make it stronger type instead or any
?
if (isAnomalyDetection) { | ||
// Ensure returnData is set to true for both metrics in anomaly detection | ||
const cfnMetrics = alarm.metrics as any[]; | ||
if (cfnMetrics && cfnMetrics.length >= 2) { | ||
cfnMetrics[0].returnData = true; | ||
cfnMetrics[1].returnData = true; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you maybe provide the documentation in the comment? This code seems weird and providing the doc in the comment for future reference would help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
API-wise, it seems that to create an Anomaly detection alarm, 3 things need to happen:
- You must set
thresholdMetricId
- You must wrap metric in a MetricMath that uses
ANOMALY_DETECTION_BAND
- You must not set a
threshold
- For an operator you must pick one of the special 3 comparison operators.
That feels like a lot of work, no?
/** | ||
* In an alarm based on an anomaly detection model, this is the ID of the ANOMALY_DETECTION_BAND function used as the threshold for the alarm. | ||
* | ||
* @default - No metric id. | ||
*/ | ||
readonly thresholdMetricId?: string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't really expose or rely on people specifying metric IDs anywhere else.
Why don't we expect them to pass in a MetricMath
object here, which must of the kind ANOMALY_DETECTION_BAND
, and we render it into the metrics[]
array and render its id correctly to the underlying L1 property?
(Or in fact if not given we could create the ANOMALY_DETECTION_BAND()
automatically with a sensible stdDev like in createAnomalyDetectionAlarm
)
@@ -190,7 +213,8 @@ export class Alarm extends AlarmBase { | |||
|
|||
// Evaluation | |||
comparisonOperator, | |||
threshold: props.threshold, | |||
threshold: isAnomalyDetection ? undefined : props.threshold, | |||
thresholdMetricId: isAnomalyDetection ? (props.thresholdMetricId || 'expr_1') : undefined, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How likely is expr_1
to be correct?
cfnMetrics[0].returnData = true; | ||
cfnMetrics[1].returnData = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the logic here, in that we have at least 2 metrics (but could be 5 or 10), and we switch the first 2 to returnData
? How do we know these are the right ones? Which ones are you trying to target?
return new Alarm(scope, id, { | ||
...props, | ||
metric: anomalyDetectionExpression, | ||
threshold: 0, // This value will be ignored for anomaly detection alarms |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we make threshold
optional then, and do a runtime validation for its presence?
* - The minimum value should be greater than 0. A value of 0 or negative values would not make sense in the context of calculating standard deviations. | ||
* - There is no strict maximum value defined, as standard deviations can theoretically extend infinitely. However, in practice, values beyond 5 or 6 standard deviations are rarely used, as they would result in an extremely wide anomaly detection band, potentially missing significant anomalies. | ||
*/ | ||
readonly bounds: number; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we call this stdDevs
then?
/** | ||
* Comparison to use to check if metric is breaching | ||
*/ | ||
readonly comparisonOperator: ComparisonOperator; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the "within bounds" operator would make a nice default here.
* @param metric The metric to create the alarm for | ||
* @returns The newly created Alarm | ||
*/ | ||
public static createAnomalyDetectionAlarmFromMetric(scope: Construct, id: string, props: CreateAnomalyDetectionAlarmProps, metric: IMetric): Alarm { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keep in mind that this function is intended as a convenience function -- the main interface for users is still new Alarm()
, and so that API should still be ergonomic.
Don't make design decisions that make it easy to implement this function, but harder to use the new Alarm()
API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See previous review
Issue #10540
Closes #10540.
Reason for this change
This change add the possibility to create an anomaly detection alarm based on a
Metric
orMathExpression
object.Description of changes
Creation of a helper function in
Metric
orMathExpression
classes which will wrap the actual metric into an anomaly detection metric expression before creating the CloudWatch alarm.This PR also include a refactor of the
CreateAlarmOptions
class in order to reflect the incompatibilities between standard alarm and anomaly detection ones.Description of how you validated changes
Executed the new integration test (
packages/@aws-cdk-testing/framework-integ/test/aws-cloudwatch/test/integ.anomaly-detection-alarm.ts
) on a test account.Checklist
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license