
"sliding window" thresholds #2379

Open
LaserPhaser opened this issue Feb 7, 2022 · 4 comments

LaserPhaser commented Feb 7, 2022

Feature Description

The current implementation of the threshold mechanism works only on absolute values computed over the whole run.
For example:
When I set an "autostop" at 5% errors, it means I need to accumulate 5% errors over the whole run before it triggers.
But degradations usually happen when the RPS becomes really high.
And if you reach that RPS step by step, you have to wait for some time: you can have, for example, a 100% error rate for the last 1 minute while it still amounts to only 10% of errors for the whole run.
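
For context, assuming the "autostop" mentioned above maps to a regular k6 threshold with abortOnFail, the current whole-run behaviour would look roughly like this (a minimal sketch; the 5% figure is just the example from above):

export const options = {
  thresholds: {
    // rate is computed over all samples since the start of the test,
    // and abortOnFail stops the run once the threshold is crossed
    http_req_failed: [{ threshold: 'rate<0.05', abortOnFail: true }],
  },
};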

As a numeric example, we run the following configuration:
10 RPS for 1 minute - 600 requests in total - all 200 OK
20 RPS for 1 minute - 1200 requests in total - all 200 OK
30 RPS for 1 minute - 1800 requests in total - all 200 OK
50 RPS for 1 minute - 3000 requests in total - all 200 OK
60 RPS for 1 minute - 3600 requests in total - and here the crash happens during the last 10 seconds, so 3000 requests are OK and 600 are errors

So in the end we have
600 + 1200 + 1800 + 3000 + 3000 = 9600 "200 OK" responses
and 600 "500 fail" responses.

Those 600 errors are only about 5.9% of the total.

But for the last 10 seconds, the error rate is 100%.

Suggested Solution (optional)

My suggestion is to add "sliding window" support to thresholds.
For example, I might be interested in the "error rate" only for the last 1 minute, or even the last 10 seconds.
Something like:

export const options = {
  thresholds: {
    http_req_failed: ['rate<0.01[1m]'], // http errors should be less than 1% for the last 1m
    http_req_duration: ['p(95)<200[10m]'], // 95% of requests should be below 200ms for the last 10min
  },
};

Already existing or connected issues / PRs (optional)

No response

@LaserPhaser LaserPhaser changed the title Thresholds for "sliding window" "sliding window" thresholds Feb 7, 2022
na-- (Member) commented Feb 7, 2022

This is somewhat of a duplicate of #1136, but it's much better explained (😊) and the other issue has become more of a catch-all that just collects various semi-related threshold improvement ideas, so I'll leave both open for now...

Implementing this efficiently will be quite complicated though. Sliding time windows are probably easy and efficient to implement for Counter metrics, but not so much for Trend ones... And I have no idea how HDR histograms (#763) will work with them 😕 The syntax might also look different from what you propose - there are other issues with the current threshold syntax and we might adopt a v2 syntax that resembles something like PromQL, for example... 🤷‍♂️ Still, it's definitely a very valid use case we need to address, so thank you for opening such a detailed issue.
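
To illustrate why Counter/Rate-style metrics are the "easy" case: a windowed failure rate only needs per-bucket counts, while Trend percentiles would need the raw samples (or mergeable digests) for every bucket. A purely illustrative sketch of the bucketed approach (this is not k6 code, and all names are made up):

// Keep one bucket per second; the windowed rate is then just a sum over buckets.
class SlidingWindowRate {
  constructor(windowSeconds) {
    this.windowSeconds = windowSeconds;
    this.buckets = new Map(); // second -> { fails, total }
  }
  add(timestampMs, failed) {
    const sec = Math.floor(timestampMs / 1000);
    const b = this.buckets.get(sec) || { fails: 0, total: 0 };
    b.total += 1;
    if (failed) b.fails += 1;
    this.buckets.set(sec, b);
  }
  rate(nowMs) {
    const cutoff = Math.floor(nowMs / 1000) - this.windowSeconds;
    let fails = 0;
    let total = 0;
    for (const [sec, b] of this.buckets) {
      if (sec < cutoff) {
        this.buckets.delete(sec); // evict buckets that fell out of the window
        continue;
      }
      fails += b.fails;
      total += b.total;
    }
    return total === 0 ? 0 : fails / total;
  }
}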

For now, as a workaround in some situations, you can approach the problem from the opposite direction... Instead of setting thresholds for time windows, you can set the thresholds for specific tags (sub-metrics) and use the recently introduced ability to manually set VU-wide custom metric tags through the vu.tags property from k6/execution. You can set different tag values based on the current test execution time, e.g. here's how you can tag metrics based on the stage the script is currently in: #796 (comment) It's not the same and it's much less flexible than sliding time windows, but it's a viable workaround for some simpler cases.
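
A minimal sketch of that workaround, assuming a simple two-stage ramp (the loadStage tag name, the stage boundaries, and the URL are made up for illustration):

import http from 'k6/http';
import exec from 'k6/execution';

export const options = {
  stages: [
    { duration: '4m', target: 50 }, // ramp-up
    { duration: '1m', target: 60 }, // high load
  ],
  thresholds: {
    // only samples tagged with loadStage:high are checked by this threshold
    'http_req_failed{loadStage:high}': ['rate<0.01'],
  },
};

export default function () {
  // tag every metric sample this VU emits based on how far into the test we are
  const elapsedMs = Date.now() - exec.scenario.startTime;
  exec.vu.tags['loadStage'] = elapsedMs > 4 * 60 * 1000 ? 'high' : 'ramp';
  http.get('https://test.k6.io/');
}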

LaserPhaser (Author) commented

@na-- maybe we can use https://pkg.go.dev/github.com/RussellLuo/slidingwindow#section-readme for example?
I think I can implement a sliding window for "rate" with this library as a proof of concept for the feature.

na-- (Member) commented Feb 10, 2022

maybe we can use https://pkg.go.dev/github.com/RussellLuo/slidingwindow#section-readme for example?

I am not sure this specific library could actually be used to calculate the sliding window thresholds for a Rate metric, it seems more like a rate-limiter implementation 😕 Maybe some of its internals can be reused, I don't know, but it doesn't matter all that much for now - that's probably the smallest potential problem I can see with this proposal. I don't want to dissuade you from trying to implement something like this, but there are a lot of issues and current in-progress work that surrounds these parts of k6 and that will probably prevent us from merging any such contribution soon, if ever... 😞

We are currently in the midst of some pretty big threshold refactoring (see #2356 and the connected issues, cc @oleiade), as the first step towards better thresholds. The problem is, we are still not sure about what steps 2, 3 and so on look like yet. We just know that there are plenty of deficiencies with the current thresholds, both in their capabilities and in their syntax, but we don't know exactly what the end goal looks like yet. For example, the syntax v2 might be PromQL-like, it might be something like what you propose (though rate[1m]<0.01 is probably better than rate<0.01[1m] 🤔 ), it might be something completely different 🤷‍♂️

Somewhat connected to the above, we are also in the middle of refactoring how we handle metrics and metric samples. Recently we introduced a metrics registry (#1832) and likely upcoming changes include the tracking of distinct time series (#1831), user control of which metrics and sub-metrics k6 actually emits (#1321), and refactoring in how we store metrics in-memory, likely including transitioning to something like HDR histograms (#763) for Trend metrics.

Finally, thresholds in k6 run are evaluated somewhat differently from thresholds in k6 cloud / distributed tests, since there you have multiple streams of metrics to crunch. So, even if the local implementation looks easy, the cloud/distributed execution needs its own evaluation and/or additional validation.

All of these things might introduce different tradeoffs and affect how we implement "sliding window" thresholds, and vice-versa. So, it's currently difficult to gauge if any one-off changes like the one you propose in this issue will be in the direction we want to go or in some different direction that ties our hands... 😞

srperf commented Sep 18, 2023

I would do this with a custom metric:
Create a threshold against it.
During the execution, add values to it as they are generated.
Every time we move from one time window to the next, increase the metric by an order of magnitude and adjust the threshold as well. That way the previous values are no longer significant enough for the threshold to take them into account.
That is one idea for this situation (a sketch of the custom-metric part follows below).

The other one I can think of is to add the ability to reset custom metrics while keeping the threshold against that metric.
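
A minimal sketch of the custom-metric part of that suggestion (the metric name, the URL, and the 5% figure are made up; any window-to-window rescaling would still have to be done by the script itself):

import http from 'k6/http';
import { Rate } from 'k6/metrics';

// a custom Rate metric that the script feeds itself, so the script decides
// which requests count towards it
const windowedErrors = new Rate('windowed_errors');

export const options = {
  thresholds: {
    windowed_errors: ['rate<0.05'],
  },
};

export default function () {
  const res = http.get('https://test.k6.io/');
  windowedErrors.add(res.status >= 400);
}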
