Skip to content
This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Validation API Backend for AD everywhere #221

Open
wants to merge 25 commits into
base: main
Choose a base branch
from

Conversation

amitgalitz
Copy link
Contributor

@amitgalitz amitgalitz commented Sep 2, 2020

Description of changes:
Added a new API endpoint to AD in order to validate anomaly detector configurations and return back if any failures or suggestions occurred. The validation API takes an input of detector configurations and then validates it against both general checks and against the data source to know if the configurations will likely lead to successful anomaly detector creation.
Endpoint: Post _opendistro/_anomaly_detection/detectors/_validate
Input : Anomaly Detector configurations
image

Overall the validation checks if any fields are missing, if any data can be found with the given filter query and feature query. I also check whether the feature aggregation type is valid for the field type. The last two steps of the validation, checks if the detector interval used will produce data that is dense enough, recommending a different interval if this isn't the case in milliseconds. The API also checks for the last seen data point and returns it as the window delay recommendation.

Example input for detector interval that is too short :

{
    "name":"test-detector-02",
    "description":"Test detector",
    "time_field":"timestamp",
    "indices":[
        "test-index-sparse-3-demo"
    ],
    "feature_attributes":[
        {
            "feature_name":"total_order",
            "feature_enabled":true,
            "aggregation_query":{
                "total_order":{
                    "max":{
                        "field":"field-1"
                    }
                }
            }
        }
    ],
    "detection_interval":{
        "period":{
            "interval":10,
            "unit":"Minutes"
        }
    },
    "window_delay":{
        "period":{
            "interval":150,
            "unit":"Minutes"
        }
    }
}

Output:

{
    "failures":{

    },
    "suggestedChanges":{
        "detection_interval":[
            "4320000" *90 minutes recommendation
        ]
    }
}

Testing Done:
Integration testing added
manually tested that API works with both dense and sparse sample data and some of the tms-issue data.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@codecov
Copy link

codecov bot commented Sep 2, 2020

Codecov Report

Merging #221 into master will decrease coverage by 4.07%.
The diff coverage is 22.59%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #221      +/-   ##
============================================
- Coverage     72.41%   68.34%   -4.08%     
- Complexity     1290     1324      +34     
============================================
  Files           139      145       +6     
  Lines          6073     6593     +520     
  Branches        469      518      +49     
============================================
+ Hits           4398     4506     +108     
- Misses         1464     1874     +410     
- Partials        211      213       +2     
Flag Coverage Δ Complexity Δ
#cli 80.30% <ø> (ø) 0.00 <ø> (ø)
#plugin 66.88% <22.59%> (-4.48%) 1324.00 <42.00> (+34.00) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ Complexity Δ
...distroforelasticsearch/ad/model/DateTimeRange.java 0.00% <0.00%> (ø) 0.00 <0.00> (?)
.../handler/ValidateAnomalyDetectorActionHandler.java 0.00% <0.00%> (ø) 0.00 <0.00> (?)
...rch/ad/transport/AnomalyResultTransportAction.java 78.46% <ø> (ø) 59.00 <0.00> (ø)
...stroforelasticsearch/ad/util/RestHandlerUtils.java 100.00% <ø> (ø) 13.00 <0.00> (ø)
...rch/ad/rest/RestValidateAnomalyDetectorAction.java 63.63% <63.63%> (ø) 3.00 <3.00> (?)
...stroforelasticsearch/ad/model/AnomalyDetector.java 88.17% <88.73%> (-0.72%) 63.00 <25.00> (+17.00) ⬇️
...troforelasticsearch/ad/model/ValidateResponse.java 95.65% <95.65%> (ø) 8.00 <8.00> (?)
...stroforelasticsearch/ad/AnomalyDetectorPlugin.java 93.61% <100.00%> (+0.06%) 10.00 <0.00> (ø)
...oforelasticsearch/ad/model/ValidationFailures.java 100.00% <100.00%> (ø) 3.00 <3.00> (?)
...ticsearch/ad/model/ValidationSuggestedChanges.java 100.00% <100.00%> (ø) 3.00 <3.00> (?)
... and 7 more

@kaituo
Copy link
Member

kaituo commented Sep 3, 2020

Unit test coverage drops and cause workflow to fail. Is it possible to write unit tests to cover it? Or can you let the workflow count your IT tests to the new coverage?

uiMetadata,
schemaVersion,
lastUpdateTime,
true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the only place that the method is different from the parse method?

Comment on lines +83 to +93
protected static final String AGG_NAME_MAX = "max_timefield";
protected static final int NUM_OF_INTERVAL_SAMPLES = 128;
protected static final int MAX_NUM_OF_SAMPLES_VIEWED = 128;
protected static final int NUM_OF_INTERVALS_CHECKED = 256;
protected static final double SAMPLE_SUCCESS_RATE = 0.75;
protected static final int FEATURE_VALIDATION_TIME_BACK_MINUTES = 10080;
protected static final int NUM_OF_INTERVALS_CHECKED_FILTER = 384;
protected static final long MAX_INTERVAL_LENGTH = (30L * 24 * 60 * 60 * 1000);
protected static final long HISTORICAL_CHECK_IN_MS = (90L * 24 * 60 * 60 * 1000);
protected static final String NAME_REGEX = "[a-zA-Z0-9._-]+";
protected static final double INTERVAL_RECOMMENDATION_MULTIPLIER = 1.2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add documentation on how the numbers are determined?

}
String error = RestHandlerUtils.validateAnomalyDetector(anomalyDetector, maxAnomalyFeatures);
if (StringUtils.isNotBlank(error)) {
List<String> dupErrorsFeatures = new ArrayList<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error returned by RestHandlerUtils.validateAnomalyDetector is not just about duplicate feature. Can we rename related variable and enum?


private void checkADNameExists() {
BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
boolQueryBuilder.must(QueryBuilders.termQuery("name.keyword", anomalyDetector.getName()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have used the constant "name.keyword" in src/main/java/com/amazon/opendistroforelasticsearch/ad/rest/handler/IndexAnomalyDetectorActionHandler.java as well. Can you declare a variable in sth like com.amazon.opendistroforelasticsearch.ad.constant.CommonName and reference it?

.aggregation(AggregationBuilders.max(AGG_NAME_MAX).field(anomalyDetector.getTimeField()))
.size(1)
.sort(new FieldSortBuilder(anomalyDetector.getTimeField()).order(SortOrder.DESC));
SearchRequest searchRequest = new SearchRequest().indices(anomalyDetector.getIndices().get(0)).source(searchSourceBuilder);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we have multiple indices?

Map<String, List<Map<String, ?>>> suggestionsMap = (Map<String, List<Map<String, ?>>>) XContentMapValues
.extractValue("suggestedChanges", responseMap);
assertTrue(failuresMap.keySet().size() == 1);
assertTrue(failuresMap.containsKey("others"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also check the value?

.extractValue("suggestedChanges", responseMap);
assertTrue(failuresMap.keySet().size() == 0);
assertTrue(suggestionsMap.keySet().size() == 1);
assertTrue(suggestionsMap.containsKey("detection_interval"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also check the value?

.extractValue("suggestedChanges", responseMap);
assertTrue(failuresMap.keySet().size() == 0);
assertTrue(suggestionsMap.keySet().size() == 1);
assertTrue(suggestionsMap.containsKey("feature_attributes"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we check the value?

XContentBuilder xContentBuilder = builder.startObject();

xContentBuilder.startObject("failures");
for (String key : failures.keySet()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if failures or suggestedChanges are null?

.xContent()
.createParser(xContent, LoggingDeprecationHandler.INSTANCE, feature.getAggregation().toString());
parser.nextToken();
List<String> fieldNames = parseAggregationRequest(parser);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if fieldNames's size is 0?

@kaituo
Copy link
Member

kaituo commented Sep 4, 2020

Unit test coverage drops and cause workflow to fail. Is it possible to write unit tests to cover it? Or can you let the workflow count your IT tests to the new coverage?

src/main/java/com/amazon/opendistroforelasticsearch/ad/rest/handler/ValidateAnomalyDetectorActionHandler.java may require more unit tests.

Base automatically changed from master to main February 8, 2021 17:49
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants