Validation API Backend for AD everywhere #221

amitgalitz · 2020-09-02T01:04:57Z

Description of changes:
Adds a new API endpoint to AD that validates anomaly detector configurations and returns any failures or suggested changes. The validation API takes a detector configuration as input and validates it against both general checks and the data source, to determine whether the configuration is likely to lead to successful anomaly detector creation.
Endpoint: POST _opendistro/_anomaly_detection/detectors/_validate
Input: Anomaly detector configuration
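
As a rough illustration of calling the endpoint from a client, the sketch below builds (but does not send) the POST request with the JDK's `java.net.http` API (Java 11+). The class name `ValidateRequestExample` and the base URL passed in by the caller are assumptions for illustration only; they are not part of this PR.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of the plugin.
final class ValidateRequestExample {
    // Path taken from the PR description.
    static final String VALIDATE_PATH = "/_opendistro/_anomaly_detection/detectors/_validate";

    /** Builds a POST request carrying the detector configuration as a JSON body. */
    static HttpRequest build(String baseUrl, String detectorJson) {
        return HttpRequest.newBuilder(URI.create(baseUrl + VALIDATE_PATH))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(detectorJson))
            .build();
    }
}
```

The request could then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, assuming a cluster is reachable at the chosen base URL (for example `http://localhost:9200` on a local test node).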

Overall, the validation checks whether any fields are missing and whether any data can be found with the given filter query and feature query. It also checks whether the feature aggregation type is valid for the field type. The last two steps of the validation check whether the detector interval will produce data that is dense enough, recommending a different interval (in milliseconds) if it will not. The API also looks up the last seen data point and returns it as the window delay recommendation.
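
The density check can be pictured with a small sketch. It assumes that "dense enough" means the fraction of sampled intervals containing at least one data point meets the `SAMPLE_SUCCESS_RATE` of 0.75 declared in the handler; the class and method names are hypothetical, and the real handler may compute this differently.

```java
import java.util.List;

// Hypothetical sketch of a density check, not the handler's actual code.
final class DensityCheck {
    // Same value as SAMPLE_SUCCESS_RATE in ValidateAnomalyDetectorActionHandler.
    static final double SAMPLE_SUCCESS_RATE = 0.75;

    /**
     * Returns true when the share of sampled intervals that contain data
     * is at least SAMPLE_SUCCESS_RATE. An empty sample is treated as sparse.
     */
    static boolean isDenseEnough(List<Boolean> intervalHasData) {
        if (intervalHasData.isEmpty()) {
            return false;
        }
        long hits = intervalHasData.stream().filter(Boolean::booleanValue).count();
        return (double) hits / intervalHasData.size() >= SAMPLE_SUCCESS_RATE;
    }
}
```

With this assumption, three populated intervals out of four pass the check, while one out of two does not.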

Example input for a detector interval that is too short:

{
    "name":"test-detector-02",
    "description":"Test detector",
    "time_field":"timestamp",
    "indices":[
        "test-index-sparse-3-demo"
    ],
    "feature_attributes":[
        {
            "feature_name":"total_order",
            "feature_enabled":true,
            "aggregation_query":{
                "total_order":{
                    "max":{
                        "field":"field-1"
                    }
                }
            }
        }
    ],
    "detection_interval":{
        "period":{
            "interval":10,
            "unit":"Minutes"
        }
    },
    "window_delay":{
        "period":{
            "interval":150,
            "unit":"Minutes"
        }
    }
}

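Output (the recommendation is in milliseconds; this value was annotated as a "90 minutes recommendation", although 4320000 ms is 72 minutes):

{
    "failures":{

    },
    "suggestedChanges":{
        "detection_interval":[
            "4320000"
        ]
    }
}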
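
Because the suggested interval comes back as a string of milliseconds, a client will usually want to convert it before showing it to a user. A minimal sketch, with a hypothetical class and method name:

```java
import java.time.Duration;

// Hypothetical client-side helper, not part of the plugin.
final class IntervalSuggestion {
    /** Converts a millisecond value, given as text, into whole minutes. */
    static long toMinutes(String millisText) {
        return Duration.ofMillis(Long.parseLong(millisText)).toMinutes();
    }
}
```

For the response above, `IntervalSuggestion.toMinutes("4320000")` returns 72.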

Testing Done:
Integration tests added.
Manually tested that the API works with both dense and sparse sample data and some of the tms-issue data.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

codecov · 2020-09-02T01:09:35Z

Codecov Report

Merging #221 into master will decrease coverage by 4.07%.
The diff coverage is 22.59%.

@@             Coverage Diff              @@
##             master     #221      +/-   ##
============================================
- Coverage     72.41%   68.34%   -4.08%     
- Complexity     1290     1324      +34     
============================================
  Files           139      145       +6     
  Lines          6073     6593     +520     
  Branches        469      518      +49     
============================================
+ Hits           4398     4506     +108     
- Misses         1464     1874     +410     
- Partials        211      213       +2

Flag	Coverage Δ	Complexity Δ
#cli	`80.30% <ø> (ø)`	`0.00 <ø> (ø)`
#plugin	`66.88% <22.59%> (-4.48%)`	`1324.00 <42.00> (+34.00)`	⬇️

Flags with carried forward coverage won't be shown.

Impacted Files	Coverage Δ	Complexity Δ
...distroforelasticsearch/ad/model/DateTimeRange.java	`0.00% <0.00%> (ø)`	`0.00 <0.00> (?)`
.../handler/ValidateAnomalyDetectorActionHandler.java	`0.00% <0.00%> (ø)`	`0.00 <0.00> (?)`
...rch/ad/transport/AnomalyResultTransportAction.java	`78.46% <ø> (ø)`	`59.00 <0.00> (ø)`
...stroforelasticsearch/ad/util/RestHandlerUtils.java	`100.00% <ø> (ø)`	`13.00 <0.00> (ø)`
...rch/ad/rest/RestValidateAnomalyDetectorAction.java	`63.63% <63.63%> (ø)`	`3.00 <3.00> (?)`
...stroforelasticsearch/ad/model/AnomalyDetector.java	`88.17% <88.73%> (-0.72%)`	`63.00 <25.00> (+17.00)`	⬇️
...troforelasticsearch/ad/model/ValidateResponse.java	`95.65% <95.65%> (ø)`	`8.00 <8.00> (?)`
...stroforelasticsearch/ad/AnomalyDetectorPlugin.java	`93.61% <100.00%> (+0.06%)`	`10.00 <0.00> (ø)`
...oforelasticsearch/ad/model/ValidationFailures.java	`100.00% <100.00%> (ø)`	`3.00 <3.00> (?)`
...ticsearch/ad/model/ValidationSuggestedChanges.java	`100.00% <100.00%> (ø)`	`3.00 <3.00> (?)`
... and 7 more

kaituo · 2020-09-03T19:57:17Z

Unit test coverage drops and causes the workflow to fail. Is it possible to write unit tests to cover it? Or can you let the workflow count your integration tests toward the new coverage?

kaituo · 2020-09-03T20:02:15Z

src/main/java/com/amazon/opendistroforelasticsearch/ad/model/AnomalyDetector.java

+            uiMetadata,
+            schemaVersion,
+            lastUpdateTime,
+            true


Is this the only place that the method is different from the parse method?

kaituo · 2020-09-03T21:49:26Z

.../amazon/opendistroforelasticsearch/ad/rest/handler/ValidateAnomalyDetectorActionHandler.java

+    protected static final String AGG_NAME_MAX = "max_timefield";
+    protected static final int NUM_OF_INTERVAL_SAMPLES = 128;
+    protected static final int MAX_NUM_OF_SAMPLES_VIEWED = 128;
+    protected static final int NUM_OF_INTERVALS_CHECKED = 256;
+    protected static final double SAMPLE_SUCCESS_RATE = 0.75;
+    protected static final int FEATURE_VALIDATION_TIME_BACK_MINUTES = 10080;
+    protected static final int NUM_OF_INTERVALS_CHECKED_FILTER = 384;
+    protected static final long MAX_INTERVAL_LENGTH = (30L * 24 * 60 * 60 * 1000);
+    protected static final long HISTORICAL_CHECK_IN_MS = (90L * 24 * 60 * 60 * 1000);
+    protected static final String NAME_REGEX = "[a-zA-Z0-9._-]+";
+    protected static final double INTERVAL_RECOMMENDATION_MULTIPLIER = 1.2;


Could you add documentation on how the numbers are determined?
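
One possible shape for such documentation, at least for the time-based constants, is to derive them from `java.time.Duration` so the unit is visible at the declaration. This is only a sketch in a hypothetical holder class; it documents the units, not the reasoning behind the sample counts (128, 256, 384), the 0.75 success rate, or the 1.2 multiplier, which this thread does not state.

```java
import java.time.Duration;

// Hypothetical holder class; the PR declares these constants in the handler itself.
final class ValidationWindows {
    /** Look-back window for feature validation: 7 days, expressed in minutes (10080). */
    static final int FEATURE_VALIDATION_TIME_BACK_MINUTES = (int) Duration.ofDays(7).toMinutes();

    /** Upper bound on the detector interval: 30 days, in milliseconds. */
    static final long MAX_INTERVAL_LENGTH = Duration.ofDays(30).toMillis();

    /** Historical range that is checked: 90 days, in milliseconds. */
    static final long HISTORICAL_CHECK_IN_MS = Duration.ofDays(90).toMillis();

    private ValidationWindows() {}
}
```

The values are identical to the literals in the diff; only the way they are written changes.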

kaituo · 2020-09-03T21:58:49Z

.../amazon/opendistroforelasticsearch/ad/rest/handler/ValidateAnomalyDetectorActionHandler.java

+        }
+        String error = RestHandlerUtils.validateAnomalyDetector(anomalyDetector, maxAnomalyFeatures);
+        if (StringUtils.isNotBlank(error)) {
+            List<String> dupErrorsFeatures = new ArrayList<>();


The error returned by RestHandlerUtils.validateAnomalyDetector is not only about duplicate features. Can we rename the related variable and enum?

kaituo · 2020-09-03T22:37:08Z

.../amazon/opendistroforelasticsearch/ad/rest/handler/ValidateAnomalyDetectorActionHandler.java

+
+    private void checkADNameExists() {
+        BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
+        boolQueryBuilder.must(QueryBuilders.termQuery("name.keyword", anomalyDetector.getName()));


We also use the constant "name.keyword" in src/main/java/com/amazon/opendistroforelasticsearch/ad/rest/handler/IndexAnomalyDetectorActionHandler.java. Can you declare a variable in something like com.amazon.opendistroforelasticsearch.ad.constant.CommonName and reference it?
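
A minimal sketch of what that shared constant might look like. The class below is a stand-in for the existing `CommonName` class, and the constant name `NAME_KEYWORD` is hypothetical.

```java
// Stand-in for com.amazon.opendistroforelasticsearch.ad.constant.CommonName.
final class CommonName {
    /** Field path used when querying for detectors with a given name (hypothetical constant name). */
    static final String NAME_KEYWORD = "name.keyword";

    private CommonName() {}
}
```

Both handlers could then call `QueryBuilders.termQuery(CommonName.NAME_KEYWORD, anomalyDetector.getName())` instead of repeating the string literal.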

kaituo · 2020-09-03T22:39:25Z

.../amazon/opendistroforelasticsearch/ad/rest/handler/ValidateAnomalyDetectorActionHandler.java

+            .aggregation(AggregationBuilders.max(AGG_NAME_MAX).field(anomalyDetector.getTimeField()))
+            .size(1)
+            .sort(new FieldSortBuilder(anomalyDetector.getTimeField()).order(SortOrder.DESC));
+        SearchRequest searchRequest = new SearchRequest().indices(anomalyDetector.getIndices().get(0)).source(searchSourceBuilder);


What if we have multiple indices?
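
Since `SearchRequest#indices` accepts a `String...` argument, one option is to pass every configured index rather than only the first. The helper below is a hypothetical, dependency-free sketch of the list-to-array conversion involved.

```java
import java.util.List;

// Hypothetical helper, shown without Elasticsearch dependencies.
final class IndexTargets {
    /** Converts the detector's index list into the array form that a String... parameter accepts. */
    static String[] allIndices(List<String> indices) {
        return indices.toArray(new String[0]);
    }
}
```

In the handler this would read roughly `new SearchRequest().indices(IndexTargets.allIndices(anomalyDetector.getIndices()))` in place of `getIndices().get(0)`.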

kaituo · 2020-09-04T03:58:59Z

src/test/java/com/amazon/opendistroforelasticsearch/ad/e2e/DetectionResultEvalutationIT.java

+        Map<String, List<Map<String, ?>>> suggestionsMap = (Map<String, List<Map<String, ?>>>) XContentMapValues
+            .extractValue("suggestedChanges", responseMap);
+        assertTrue(failuresMap.keySet().size() == 1);
+        assertTrue(failuresMap.containsKey("others"));


Can we also check the value?

kaituo · 2020-09-04T03:59:54Z

src/test/java/com/amazon/opendistroforelasticsearch/ad/e2e/DetectionResultEvalutationIT.java

+            .extractValue("suggestedChanges", responseMap);
+        assertTrue(failuresMap.keySet().size() == 0);
+        assertTrue(suggestionsMap.keySet().size() == 1);
+        assertTrue(suggestionsMap.containsKey("detection_interval"));


Can we also check the value?
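
A value check could look like the sketch below. It assumes the suggestion values are lists of strings, as in the example response in the PR description (the integration test itself casts to a different map type), and the helper name is hypothetical.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical test helper, written without JUnit so it stands alone.
final class SuggestionAssertions {
    /** Fails unless the only suggestion is a single detection_interval with the expected value. */
    static void assertSingleIntervalSuggestion(Map<String, List<String>> suggestions, String expectedMillis) {
        List<String> values = suggestions.get("detection_interval");
        if (suggestions.size() != 1 || values == null) {
            throw new AssertionError("expected only detection_interval, got " + suggestions.keySet());
        }
        if (!values.equals(Collections.singletonList(expectedMillis))) {
            throw new AssertionError("expected [" + expectedMillis + "], got " + values);
        }
    }
}
```

In a JUnit test the same intent is usually expressed with `assertEquals`, which also reports the actual value on failure, unlike `assertTrue(x.size() == 1)`.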

kaituo · 2020-09-04T04:07:27Z

src/test/java/com/amazon/opendistroforelasticsearch/ad/e2e/DetectionResultEvalutationIT.java

+            .extractValue("suggestedChanges", responseMap);
+        assertTrue(failuresMap.keySet().size() == 0);
+        assertTrue(suggestionsMap.keySet().size() == 1);
+        assertTrue(suggestionsMap.containsKey("feature_attributes"));


Can we check the value?

kaituo · 2020-09-04T04:30:14Z

src/main/java/com/amazon/opendistroforelasticsearch/ad/model/ValidateResponse.java

+        XContentBuilder xContentBuilder = builder.startObject();
+
+        xContentBuilder.startObject("failures");
+        for (String key : failures.keySet()) {


What if failures or suggestedChanges are null?
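
One defensive option is to substitute an empty map before iterating, so that a null field serializes as an empty object. A minimal sketch with a hypothetical helper name:

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helper, not part of the PR.
final class NullSafe {
    /** Returns the given map, or an immutable empty map when it is null. */
    static <K, V> Map<K, V> orEmpty(Map<K, V> map) {
        return map == null ? Collections.<K, V>emptyMap() : map;
    }
}
```

The loop in `toXContent` could then iterate over `NullSafe.orEmpty(failures).keySet()` and the equivalent for `suggestedChanges`.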

kaituo · 2020-09-04T04:34:11Z

.../amazon/opendistroforelasticsearch/ad/rest/handler/ValidateAnomalyDetectorActionHandler.java

+                    .xContent()
+                    .createParser(xContent, LoggingDeprecationHandler.INSTANCE, feature.getAggregation().toString());
+                parser.nextToken();
+                List<String> fieldNames = parseAggregationRequest(parser);


What if fieldNames's size is 0?
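
A guard for that case might report a failure instead of continuing with an empty list. The sketch below is hypothetical in its class name, method name, and message text.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical guard, not part of the PR.
final class FieldNameCheck {
    /** Returns an error message when the feature's aggregation references no field, else empty. */
    static Optional<String> validate(String featureName, List<String> fieldNames) {
        if (fieldNames == null || fieldNames.isEmpty()) {
            return Optional.of("Feature " + featureName + " does not reference any field");
        }
        return Optional.empty();
    }
}
```

The handler could add the returned message to its failures map before running the field-type check.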

kaituo · 2020-09-04T17:38:02Z

Unit test coverage drops and causes the workflow to fail. Is it possible to write unit tests to cover it? Or can you let the workflow count your integration tests toward the new coverage?

src/main/java/com/amazon/opendistroforelasticsearch/ad/rest/handler/ValidateAnomalyDetectorActionHandler.java may require more unit tests.

Galitzky and others added 22 commits July 10, 2020 11:00

starting validate API structure

f9a097d

starting validate method

ef8195b

pre validation steps begining

faf0e73

working on feature aggregation

f8d6e88

works right now but random sampling is incosistent

ffa20dd

this solved the asynch issues with search request and this works with…

8dcf2eb

… calling randomsampling once internally

changed random sampling to nonrandom and no aggregation, filter query…

2bbcb58

… instead

commit before switching to updated code

7dd7851

merge conflict

a0a5e82

recomendation fully works now

6e46041

fixed feature query validation bug

890fa5b

ready for PR

7afabb4

fixed style issue for build

8ade35d

Merge branch 'master' into newbranch-validate-1.9

e66d6a2

fixed all changes from CR and added check for field type

2f79363

Merge branch 'master' into newbranch-validate-1.9

2370d5e

style fixes

4235849

added a datetimerange class plus some other fixes

a05c272

creating new series of timerange with the DateTimeRange class

4a523f8

creating new series of timerange with the DateTimeRange class

a2e56da

style fix

143e5d8

Merge pull request #1 from amitgalitz/newbranch-validate-1.9

3487c95

Newbranch validate 1.9

amitgalitz requested review from kaituo and ylwu-amzn September 2, 2020 01:04

amitgalitz requested a review from wnbts September 4, 2020 00:06

Galitzky and others added 3 commits September 3, 2020 17:32

more unit tests added for models

ef0bddf

style fix

e9fc63a

Merge pull request #2 from amitgalitz/newbranch-validate-1.9

80c3a3e

more unit tests added for models

kaituo reviewed Sep 4, 2020

Base automatically changed from master to main February 8, 2021 17:49

amitgalitz commented Sep 2, 2020 • edited by ylwu-amzn
