
Reduce false positive rate of timing tests and add tools for handling them #673

Open
tomato42 opened this issue Jun 16, 2020 · 2 comments
Labels
complex (Issues that require good knowledge of tlsfuzzer internals), enhancement (new feature to be implemented), help wanted

Comments

@tomato42
Member

tomato42 commented Jun 16, 2020

While we now have tests to verify Lucky13 and Bleichenbacher, they have a quite significant false positive rate (>20%). We should improve the statistical classifiers used, the handling of outliers, the way the data is collected, etc., so that the false positive rate becomes more manageable (<5%).

tomato42 added the enhancement, help wanted, and complex labels Jun 16, 2020
@tomato42
Member Author

see also #106

@tomato42
Member Author

tomato42 commented Jun 28, 2020

Actually, we should be careful with sample sizes: a sample that is too small will not be able to detect effect sizes that are measurable in practice. See https://stats.stackexchange.com/a/2522/289885 :

In a situation where a "simple" null is tested against a "compound" alternative, as in classic t-tests or z-tests, it typically takes a sample size proportional to 1/ϵ² to detect an effect size of ϵ. There's a practical upper bound to this in any study, implying there's a practical lower bound on a detectable effect size. So, as a theoretical matter van der Laan and Rose are correct, but we should take care in applying their conclusion.

I.e., to detect a 1% effect size we need a sample size on the order of 10k, and a sample size on the order of 1M to detect an effect size of 0.1%.
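The 1/ϵ² proportionality from the quote can be sketched with a small helper (hypothetical, not part of tlsfuzzer). Assuming a two-sided z-test with the conventional alpha of 0.05 and power of 0.8, the proportionality constant works out to about 7.85, so the exact counts differ from the rough 10k/1M figures, but the hundredfold growth per tenfold drop in effect size holds:

```python
from statistics import NormalDist

def required_sample_size(effect_size, alpha=0.05, power=0.8):
    """Approximate sample size for a two-sided z-test to detect a
    standardized effect size at the given alpha and power.
    Grows proportionally to 1 / effect_size**2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value
    z_beta = NormalDist().inv_cdf(power)           # power requirement
    return ((z_alpha + z_beta) / effect_size) ** 2

# ~7.85e4 samples for a 1% effect, ~7.85e6 for a 0.1% effect:
print(round(required_sample_size(0.01)), round(required_sample_size(0.001)))
```

Halving the detectable effect size quadruples the required sample size, which is why very small timing side-channels need very large measurement runs.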

and we need to remember that the false positive rate is independent of sample size: for an alpha of 0.05, the 5% false positive rate stays constant no matter how many observations we collect
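That constancy is easy to demonstrate with a Monte Carlo sketch (hypothetical code, not tlsfuzzer's actual classifier). Both samples are drawn from the same distribution, so every rejection is a false positive; assuming a large-sample Welch statistic, the rejection rate hovers near alpha for small and large samples alike:

```python
import random
from statistics import NormalDist, mean, stdev

def welch_z(a, b):
    """Large-sample Welch statistic for a difference in means."""
    va, vb = stdev(a) ** 2, stdev(b) ** 2
    return (mean(a) - mean(b)) / ((va / len(a) + vb / len(b)) ** 0.5)

def false_positive_rate(n, trials=1000, alpha=0.05, seed=42):
    """Fraction of trials in which two samples drawn from the SAME
    distribution are flagged as different at the given alpha."""
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    hits = 0
    for _ in range(trials):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        if abs(welch_z(a, b)) > crit:
            hits += 1
    return hits / trials

# both should land near alpha = 0.05, regardless of n:
print(false_positive_rate(100), false_positive_rate(500))
```

Collecting more data shrinks the detectable effect size, but it does not lower the false alarm rate; only the alpha threshold (or repeated confirmation runs) does that.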

for very large sample sizes and quick response times we may need to check the practical importance, not just the statistical significance, of the result (a result telling us that one class differs from another by less than one CPU cycle is not a meaningful result), see https://stats.stackexchange.com/a/7849/289885
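One way to encode that idea is an equivalence-style check: declare a difference meaningful only when the whole confidence interval for the mean difference lies beyond a practical threshold (e.g. roughly one CPU cycle's worth of time). A minimal sketch, with hypothetical names and a caller-chosen threshold, assuming the same large-sample normal approximation as above:

```python
from statistics import NormalDist, mean, stdev

def practically_significant(a, b, threshold, alpha=0.05):
    """Return True only when the (1 - alpha) confidence interval for
    the mean difference lies entirely outside [-threshold, +threshold],
    i.e. the effect is not merely statistically detectable but large
    enough to matter in practice."""
    diff = mean(a) - mean(b)
    se = (stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)) ** 0.5
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lo, hi = diff - z * se, diff + z * se
    return lo > threshold or hi < -threshold

# a 5-unit shift is flagged; a 0.001-unit shift is dismissed even
# though a plain significance test could flag it with enough samples
fast = [10.0, 10.1, 9.9] * 100
slow = [x + 5.0 for x in fast]
print(practically_significant(fast, slow, 1.0))  # True
```

With huge sample sizes almost any fixed implementation difference eventually becomes statistically significant, so a threshold like this keeps the report focused on timing differences an attacker could actually exploit.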

tomato42 changed the title from "Reduce false positive rate of timing tests" to "Reduce false positive rate of timing tests and add tools for handling them" Jun 28, 2020
@tomato42 tomato42 mentioned this issue Nov 3, 2020
Projects
Vulnerability testers: Awaiting triage
Development
No branches or pull requests
1 participant