This repository has been archived by the owner on Jun 5, 2023. It is now read-only.

optimize outliers whitelisting by FIRST checking the outliers dictionary and then all the other fields #293

Open
daanraman opened this issue Aug 2, 2019 · 1 comment

Comments

@daanraman
Contributor

No description provided.

@detobel36
Contributor

Since #308, the "literal" whitelist values are no longer tested in any particular order.

For regexes it is more complicated, because dict_values_to_check is a set (unordered), which is used to remove duplicate values. To get an ordering back, we would need to cast it to a list and sort it by key, but the keys are not available at this step.

Another solution is to create two sets and test the first one, then the second. But one whitelist line can contain multiple regexes, and a document needs to match all of them. So three cases are possible:

  • the first set (outlier dictionary only) matches all the regexes, and the document is detected as whitelisted
  • the first set doesn't match any regex, so the second set alone decides whether the document is whitelisted
  • the first set matches only some of the regexes, so the second set needs to match all the regexes that the first set did not match.
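
A minimal sketch of the two-set idea, to make the three cases concrete. This is not the project's code: the function name `matches_all_regexes`, its parameters, and the use of `re.search` are assumptions made for illustration, and the values are assumed to be plain strings.

```python
import re
from typing import Iterable, Set


def matches_all_regexes(
    regexes: Iterable[str],
    outlier_values: Set[str],
    other_values: Set[str],
) -> bool:
    """Return True if every regex matches at least one value.

    Hypothetical helper: values from the outlier dictionary are tested
    first, and the other fields are only tested for the regexes that
    are still unmatched.
    """
    # Pass 1: keep only the regexes that no outlier-dictionary value matches.
    remaining = [
        pattern
        for pattern in regexes
        if not any(re.search(pattern, value) for value in outlier_values)
    ]
    if not remaining:
        # Case 1: the first set matched all regexes, no need to look further.
        return True

    # Pass 2 (cases 2 and 3): every regex left over must be matched
    # by at least one value from the other fields.
    return all(
        any(re.search(pattern, value) for value in other_values)
        for pattern in remaining
    )
```

With this shape, the second set is only scanned when the outlier dictionary alone was not enough, which is where the expected speed-up would come from.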
