Support training for latency based anomalies specifically for perf-anomaly #409
base: main
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #409      +/-   ##
==========================================
- Coverage   93.07%   92.17%   -0.90%
==========================================
  Files          97       97
  Lines        4492     4781     +289
  Branches      387      430      +43
==========================================
+ Hits         4181     4407     +226
- Misses        231      276      +45
- Partials       80       98      +18
Please replace "Explain what this PR does." with the real description & purpose of this PR.
if not self.post_aggregations:
    self.post_aggregations = {
        "p90": _post_agg.QuantilesDoublesSketchToQuantile(
            output_name="agg_out", field=postaggregator.Field("agg_out"), fraction=0.90
Since this is a library feature, the percentile should be configurable rather than hard-coded to 0.90.
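One way to read this suggestion is sketched below: the quantile fraction becomes a parameter with 0.90 as the default. The class `QuantilePostAgg` and its field names are hypothetical and are not part of this PR or of any library; only the emitted dictionary follows the shape of Druid's native `quantilesDoublesSketchToQuantile` post-aggregator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantilePostAgg:
    """Hypothetical helper (not from the PR): a quantile post-aggregation
    whose fraction is configurable instead of fixed at 0.90."""

    output_name: str = "agg_out"
    sketch_field: str = "agg_out"
    fraction: float = 0.90  # default matches the value hard-coded in the diff

    def __post_init__(self) -> None:
        # A quantile fraction only makes sense within [0, 1].
        if not 0.0 <= self.fraction <= 1.0:
            raise ValueError(f"fraction must be in [0, 1], got {self.fraction}")

    def to_druid(self) -> dict:
        """Render as a Druid native-query post-aggregator spec."""
        return {
            "type": "quantilesDoublesSketchToQuantile",
            "name": self.output_name,
            "field": {"type": "fieldAccess", "fieldName": self.sketch_field},
            "fraction": self.fraction,
        }


# Callers that need p99 instead of p90 override a single argument.
p99 = QuantilePostAgg(fraction=0.99).to_druid()
```

Keeping 0.90 as the default would preserve the behaviour in the diff for callers that pass nothing.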
 if not self.aggregations:
-    self.aggregations = {"count": doublesum("count")}
+    self.aggregations = {
+        "agg_out": _agg.quantiles_doubles_sketch("valuesDoublesSketch", "agg0", 64)
What does the value 64 imply? Does it need to be configurable for different latency anomaly use cases?
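For context, a hedged sketch, assuming the 64 is the sketch's `k` parameter (the diff does not say so explicitly). In Druid's `quantilesDoublesSketch` aggregator, `k` trades sketch size against accuracy: a larger `k` gives better rank accuracy at the cost of more storage. Druid documents it as a power of 2 between 2 and 32768, with a default of 128. The function name and arguments below are hypothetical and only illustrate how `k` could be exposed and validated.

```python
def quantiles_sketch_aggregation(name: str, field_name: str, k: int = 128) -> dict:
    """Hypothetical helper (not from the PR): build a Druid native
    quantilesDoublesSketch aggregator spec with a configurable k."""
    # k must be a power of 2 in [2, 32768]; k & (k - 1) == 0 tests power-of-2.
    if not (2 <= k <= 32768 and k & (k - 1) == 0):
        raise ValueError(f"k must be a power of 2 between 2 and 32768, got {k}")
    return {
        "type": "quantilesDoublesSketch",
        "name": name,
        "fieldName": field_name,
        "k": k,
    }


# Smaller k: smaller sketch, coarser quantile estimates.
coarse = quantiles_sketch_aggregation("agg_out", "valuesDoublesSketch", k=64)
# Larger k: larger sketch, finer quantile estimates.
fine = quantiles_sketch_aggregation("agg_out", "valuesDoublesSketch", k=1024)
```

Whether 64 is accurate enough for a given latency use case depends on how tight the tail quantile estimates need to be, which is an argument for exposing the value rather than fixing it.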
@@ -3,9 +3,9 @@
 ####################################################################################################

 ARG PYTHON_VERSION=3.11
-FROM python:${PYTHON_VERSION}-slim-bookworm AS builder
+FROM python:${PYTHON_VERSION}-bookworm AS builder
The -slim-bookworm here is replaced with -bookworm. However, the same -slim-bookworm at line 31 remains unchanged. Why the difference?
Explain what this PR does.
This PR supports anomaly detection on fields that use valuesDoubleSketches. We add aggregations and post-aggregations that run natively on Druid; the post-aggregations convert the sketches to values.
This enables anomaly detection for inputs that use sketches (https://datasketches.apache.org/), for example latency-based anomalies.
I have also made a few changes to the Dockerfile and added a patch for Numalogic 0.9.1 to avoid CVE issues. This matters for the perf-anomaly team because it lets them avoid moving to Numaflow 1.2.1 and updating all the UDFs and UDSinks. They save a lot of time by upgrading only the ML vertices.