
Proposal for adding http_req_errors metric #1311

Closed · sniku opened this issue Jan 17, 2020 · 10 comments
Labels: evaluation needed, feature

Comments

@sniku (Collaborator) commented Jan 17, 2020

This is a proposal for adding a new, standard metric to k6 called http_req_errors that records the number of failed HTTP requests.

Some of this functionality can be achieved today with a custom metric, but it's so basic that I suggest it should be included by default.

Feature Description

Successful requests often have longer http_req_duration than failed requests. This metric would allow k6 to display a separate http_req_duration for successful requests and separate http_req_duration_errors for failed requests.

It's not possible to define which HTTP status codes should be considered errors because these codes are context-specific. For example, HTTP-404 is not an error if someone is testing for 404s.
This problem can be solved by adding additional configuration options, as described below.

The biggest advantage of this functionality is the UX improvement.
Currently, k6 doesn't distinguish between failed and successful requests, so the terminal output of a test where every request failed is indistinguishable from that of a fully successful test.

With this change, one could see at first glance that the test was/wasn't successful without adding checks and thresholds.

Suggested Solution (optional)

let options = {
   http_errors: ["status>399"],  // default setting
   http_errors_exclude: [404, 403, 1099]
};

k6 terminal output could look similar to this:

    http_reqs..................: 10      0/s
    http_reqs_errors...........: 2 (20%)      2/s
    http_req_duration..........: avg=10.13s   min=10.13s   med=10.13s   max=10.13s   p(90)=10.13s   p(95)=10.13s
    http_req_duration_errors...: avg=1.13s    min=1.13s    med=1.13s    max=1.13s    p(90)=1.13s    p(95)=1.13s

@sniku added the feature label Jan 17, 2020
@na-- added the evaluation needed label Jan 17, 2020
@na-- (Member) commented Jan 20, 2020

There's a lot to unpack here, but as I mentioned on Slack, I'm very much for addressing the very real UX issues you mentioned, but against some of the specific approaches you proposed for doing so. I'll start by sharing all of the potential issues with the proposal I can think of; then, hopefully, we can arrive at a better alternative that achieves the same goals.

> Some of this functionality can be achieved today with a custom metric, but it's so basic that I suggest it should be included by default.

As far as I can see, all of the things you propose could be achieved with custom metrics, even the ones I disagree with 😉. And while I agree that we could definitely improve the UX and offer more flexibility for detecting failures, I disagree that it's basic, since there are a lot of complicated details to consider.

> This is a proposal for adding a new, standard metric to k6 called http_req_errors that records the number of failed HTTP requests.

I'm not against adding a new metric, but:

  1. We already have an HTTP request Counter metric, http_reqs; this would just be a sub-metric of it. I added another issue (Allow more flexible sub-metrics and better UX in thresholds #1313) that would actually make that metric more useful, but I don't think we need another counter here, because...
  2. most users would probably be more concerned with the error rate than with the absolute number of errors, so if we're adding a new metric, it should be a Rate one (see the sketch below). And when we implement Enhance available threshold values for Rate, Gauge and Trend metrics #1312, anyone who is actually interested in the absolute number of errors would be able to reference that as well.
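
To make point 2 concrete, here is a minimal sketch of what such a Rate-based metric looks like when emulated with today's custom metrics API (the name http_req_errors is the proposed one, not an existing built-in):

import http from "k6/http";
import { Rate } from "k6/metrics";

// Emulates the proposed built-in metric with a custom Rate.
let errorRate = new Rate("http_req_errors");

export let options = {
    thresholds: {
        // Fail the test if more than 1% of requests error out.
        "http_req_errors": ["rate<0.01"],
    },
};

export default function () {
    let res = http.get("https://httpbin.test.loadimpact.com/status/200");
    errorRate.add(res.status >= 400); // true counts as an error
}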

> Successful requests often have longer http_req_duration than failed requests. This metric would allow k6 to display a separate http_req_duration for successful requests and separate http_req_duration_errors for failed requests.

I'm firmly against adding a separate http_req_duration_errors metric. My biggest reason for that is that I doubt someone would actually care for the values of that _errors metric at all... And if users want to only have a threshold for the response times of successful requests, they can mostly do it even now, by just specifying the threshold like this: thresholds: { "http_req_duration{status:200}": ["p(95)<100"] }. It's not very flexible, but if that's a big problem, we should just improve the threshold matching flexibility (#1313), not just add new metrics willy-nilly, since there would be no end of the metrics we'd like to add otherwise...

In any case, I think http_req_duration is a bit of a red herring in this case. Correct me if I'm wrong, but it seems to me that the problem isn't that http_req_duration becomes too small when there are errors. Rather, the problem is that there are errors and we're not detecting them (i.e. our thresholds pass and the CI status is green). So, I think users should just be able to set a threshold for the HTTP error rate, for example thresholds: { "http_req_errors": ["rate<0.01"] }. Then the http_req_duration values don't matter, since this other threshold would be triggered and fail the test.

And, similar to what we plan for #877, we might also add a default HTTP error threshold. We should just make sure users have a way to overwrite or disable whatever default option we pick.

> It's not possible to define which HTTP status codes should be considered errors because these codes are context-specific. For example, HTTP-404 is not an error if someone is testing for 404s.
> This problem can be solved by adding additional configuration options, as described below.

In k6, problems are rarely solved by adding new global config options... 😉 #883 is a partial compilation of some of the reasons why that is the case... 😅

In this particular case, some parts of your load tests might treat 403/404/etc. errors as expected, while others might consider them actual errors, so global configuration options seem particularly unsuited for this. Instead, I think configuring this should probably be handled with the new JS HTTP API (issue... soon... 🤦‍♂️), probably at the Client level. In any case, something very similar to this conundrum has already been posted in k6 before: #1298

> The biggest advantage of this functionality is the UX improvement.
> Currently, k6 doesn't distinguish between failed and successful requests, so the terminal output of a test where every request failed is indistinguishable from that of a fully successful test.

I agree, and a new http_req_errors Rate metric should be fairly visible, especially if we automatically add a default threshold. Though the current UX for displaying rates in the end-of-test summary is a bit confusing, since it was designed to display "positive" rates:

    checks.....................: 100.00% ✓ 25   ✗ 0
  ✓ http_req_errors............: 0.00%   ✓ 0    ✗ 10

0 errors is a good thing, but ✓ 0 ✗ 10 is a bit confusing...

> Suggested Solution (optional)
>
> let options = {
>    http_errors: ["status>399"],  // default setting
>    http_errors_exclude: [404, 403, 1099]
> };

As I mentioned, I think these shouldn't be global config options, they should probably be a part of the new JS HTTP API. And while I agree with the default value of considering everything above 399 an error by default, we probably need some better way of specifying the rule. Maybe some sort of a JS-callback that evaluates HTTP responses before k6 emits metrics, maybe a more powerful rule engine (along the lines of #1313)...

We also shouldn't forget that non-2xx HTTP statuses isn't the only way an HTTP request can fail, network/DNS/protocol/etc. problems are also a possible error cause. So, whatever way we pick to filter these out, we should probably be able to handle these as well...

Also, as I mentioned in #1298, maybe we're looking at this from the wrong angle. We have to consider both issues together, since it might be enough to just let users tag requests that they expect to fail with some new system tag - then, if k6 sees such a tag, it won't add a true value to the new http_req_errors metric?
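
Something along these lines can already be expressed with the existing per-request tags parameter; note that expected_error below is the hypothetical tag from this discussion, not a system tag k6 knows about today:

import http from "k6/http";

export default function () {
    // This endpoint is supposed to return a 404, so mark the request as an
    // expected failure; a future http_req_errors metric could then skip
    // requests carrying this (hypothetical) tag.
    http.get("https://httpbin.test.loadimpact.com/status/404", {
        tags: { expected_error: "true" },
    });
}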

@sniku (Collaborator, Author) commented Jan 27, 2020

I agree that most of this functionality can be achieved today, with the current version of k6, if the end user

  1. spends enough time learning the default capabilities of k6,
  2. implements a custom Rate metric for errors,
  3. implements a custom Trend metric for response time, and
  4. adds an if condition for every HTTP response.

It can be done, but I strongly disagree that it should be.

For the sake of argument, here's the simplest implementation I could come up with for a script with one batch and one regular request (without resorting to helper utils).

import http from "k6/http";
import { sleep } from "k6";
import { Rate, Trend } from "k6/metrics";

let req_success_dur = new Trend("req_success_dur", true); // isTime=true, values in ms
let my_httpx_errors = new Rate("my_httpx_errors");

export let options = {
  thresholds: {
    'my_httpx_errors': [ 
      'rate<0.1'
    ],
    'req_success_dur': [
      'p(95)<1000'
    ]
  },
  stages: [
    { duration: "2m", target: 100 }, // below normal load
    { duration: "5m", target: 100 }, 
    { duration: "2m", target: 200 }, // normal load
    { duration: "5m", target: 200 },
    { duration: "2m", target: 300 }, // around the breaking point
    { duration: "5m", target: 300 },
    { duration: "2m", target: 400 }, // beyond the breaking point
    { duration: "5m", target: 400 }, 
    { duration: "10m", target: 0 },  // scale down. Recovery stage.
  ]  
}

export default function() {
  let responses = http.batch(
    [
      ["GET", "https://httpbin.test.loadimpact.com/status/503"],
      ["GET", "https://httpbin.test.loadimpact.com/status/403"],
      ["GET", "https://httpbin.test.loadimpact.com/status/404"],
      ["GET", "https://httpbin.test.loadimpact.com/status/201"],
    ]
  )

  Object.values(responses).forEach(resp => {
    if (resp.status >= 400 && resp.status != 404 && resp.status != 403) {
      my_httpx_errors.add(true);
    }
    else {
      req_success_dur.add(Math.round(resp.timings.duration));
      my_httpx_errors.add(false);
    }
  });

  let resp = http.get('https://httpbin.test.loadimpact.com/status/200')

  if (resp.status >= 400 && resp.status != 404 && resp.status != 403) {
    my_httpx_errors.add(true);
  }
  else {
    req_success_dur.add(Math.round(resp.timings.duration));
    my_httpx_errors.add(false);
  }

  sleep(1);
}

This script works. It records the req_success_dur and my_httpx_errors custom metrics, but it's not pretty or easy to maintain.
I can't imagine recommending this approach to users without some discomfort 😅

I agree that http_req_duration_errors is unnecessary. Few people are interested in the response duration of failed requests. I retract my request for this metric. 👍

I agree that the http_req_errors should be a Rate rather than a Counter 👍

Here's the terminal output of the above script.

[screenshot: end-of-test summary with the new custom metrics highlighted in green]

The new, correct metrics are highlighted in green. http_req_blocked and friends still account for both successful and failed requests. That's not great.
One could replicate this logic for the remaining minor metrics, but at that point the script and output would be too messy.

Again, I think that k6 should by default record only metrics for successful requests, and the manual workaround implemented above is not a viable solution.

> In any case, I think http_req_duration is a bit of a red herring in this case. Correct me if I'm wrong, but it seems to me that the problem isn't that http_req_duration becomes too small when there are errors. Rather, the problem is that there are errors and we're not detecting them (i.e. our thresholds pass and the CI status is green). So, I think users should just be able to set a threshold for the HTTP error rate, for example thresholds: { "http_req_errors": ["rate<0.01"] }. Then the http_req_duration values don't matter, since this other threshold would be triggered and fail the test.

When doing stress testing, errors are expected to appear, and you want to continue the test for an extended period of time to see when the system recovers. I believe that it's still useful to know what the http_req_duration is for successful requests when 90% of other requests are failing. It helps to determine the stress failure mode. If errors are included in http_req_duration, the metric isn't meaningful.

Another reason for implementing something like this:

let options = {
   http_errors: ["status>399"],  // default setting
   http_errors_exclude: [404, 403, 1099]
};

is for the Cloud UI to be able to understand which requests should be considered successful/failed.

Today the cloud UI assumes that any status code >= 400 is a failure and marks it red in the UI. This is clearly incorrect in many cases. The cloud should be able to understand these settings and make use of them for the UI and performance alerts.

The error metric should be displayed in the cloud by default, with the same prominence as http_req_duration.

A complementary functionality would be to introduce a change to the HTTP API. I purposefully didn't mention it in the original issue description because that's a rabbit hole 😄
In any case, something like this would be ideal for exceptions that are not handled by the global options configuration.

http.get({url: "https://httpbin.test.loadimpact.com/status/503", expect: {status: 503}});

Even though this is a more flexible solution, I suggest we don't do that as a part of this work. I think the options solution solves the problem in 95% of cases, and the change to the HTTP API can still be implemented at a later date.
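
In the meantime, the expect semantics can be approximated in user code; here is a sketch under that assumption (the helper name and the custom metric are made up for illustration):

import http from "k6/http";
import { Rate } from "k6/metrics";

let httpErrors = new Rate("my_http_errors");

// Hypothetical helper emulating the proposed per-request `expect` option.
function getExpecting(url, expectedStatus, params) {
    let resp = http.get(url, params);
    // Only a status different from the expected one counts as an error.
    httpErrors.add(resp.status !== expectedStatus);
    return resp;
}

export default function () {
    // A 503 from this endpoint is the expected answer, so it is not an error.
    getExpecting("https://httpbin.test.loadimpact.com/status/503", 503);
}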

> (...) we probably need some better way of specifying the rule. Maybe some sort of a JS-callback that evaluates HTTP responses before k6 emits metrics, maybe a more powerful rule engine (along the lines of #1313)...

Whatever format we decide on, it needs to be readable by the cloud. It would be great if it didn't depend on JS.

> We also shouldn't forget that non-2xx HTTP statuses isn't the only way an HTTP request can fail, network/DNS/protocol/etc. problems are also a possible error cause. So, whatever way we pick to filter these out, we should probably be able to handle these as well...

That's a good point. Perhaps we should rename http_req_errors to req_errors so that it records all failures rather than just HTTP failures. What do you think?

> Also, as I mentioned in #1298, maybe we're looking at this from the wrong angle. We have to consider both issues together, since it might be enough to just let users tag requests that they expect to fail with some new system tag - then, if k6 sees such a tag, it won't add a true value to the new http_req_errors metric?

#1298 would benefit from the HTTP API change I'm suggesting above, but I think the options solution would also work for them. We could go deeper into discussing the HTTP change, but I'm afraid of 🐰 holes 😄

> And if users want to only have a threshold for the response times of successful requests, they can mostly do it even now, by just specifying the threshold like this: thresholds: { "http_req_duration{status:200}": ["p(95)<100"] }. It's not very flexible, but if that's a big problem, we should just improve the threshold matching flexibility (#1313), not just add new metrics willy-nilly, since there would be no end of the metrics we'd like to add otherwise...

Errors are fundamental to load/stress testing. We should track and display errors by default in k6 UI and cloud UI. This should not be left for the users to define and implement on their end as a special case. I believe we should also have default thresholds based on the rate of errors.

I'm sorry about the length of this comment. I need to learn how to be more concise 😅

@na-- (Member) commented Jan 28, 2020

> It can be done, but I strongly disagree that it should be.

Completely agree.

> For the sake of argument, here's the simplest implementation I could come up with for a script with one batch and one regular request (without resorting to helper utils).

I'd say that something like this is a perfect place for helper utils 😉 Or, more precisely, a wrapper around http.batch() / http.request() that implements whatever logic is important to you. And, in general, if such logic is not easy or straightforward to write as a wrapper, it would be much more difficult for us to implement as a generic k6 feature... So here's how I would write that script, which I'd argue is very readable (especially if I move the wrappers into a separate library):

import http from "k6/http";
import { sleep } from "k6";
import { Rate, Trend } from "k6/metrics";

let req_success_dur = new Trend("req_success_dur", true); // isTime=true, values in ms
let my_httpx_errors = new Rate("my_httpx_errors");

function checkResponse(resp) {
    if (resp.status >= 400 && resp.status != 404 && resp.status != 403) {
        my_httpx_errors.add(true);
        return false;
    }
    req_success_dur.add(Math.round(resp.timings.duration));
    my_httpx_errors.add(false);
    return true;
}

function myGet(url, params) {
    let resp = http.get(url, params);
    checkResponse(resp);
    return resp;
}

function myBatch(reqs) {
    let resps = http.batch(reqs);
    Object.values(resps).forEach(checkResponse);
    return resps;
}

export let options = {
    thresholds: {
        'my_httpx_errors': [
            'rate<0.1'
        ],
        'req_success_dur': [
            'p(95)<1000'
        ]
    },
    stages: [
        { duration: "2m", target: 100 }, // below normal load
        { duration: "5m", target: 100 },
        { duration: "2m", target: 200 }, // normal load
        { duration: "5m", target: 200 },
        { duration: "2m", target: 300 }, // around the breaking point
        { duration: "5m", target: 300 },
        { duration: "2m", target: 400 }, // beyond the breaking point
        { duration: "5m", target: 400 },
        { duration: "10m", target: 0 },  // scale down. Recovery stage.
    ]
}

export default function () {
    myBatch([
        ["GET", "https://httpbin.test.loadimpact.com/status/503"],
        ["GET", "https://httpbin.test.loadimpact.com/status/403"],
        ["GET", "https://httpbin.test.loadimpact.com/status/404"],
        ["GET", "https://httpbin.test.loadimpact.com/status/201"],
    ]);

    myGet('https://httpbin.test.loadimpact.com/status/200')

    sleep(1);
}

> I can't imagine recommending this approach to users without some discomfort 😅

While, as I said, I'd like to improve these parts of k6 to handle the most common use cases in a better way (while also allowing the handling of more complex ones with some work), I don't think something like the above wrapper is too onerous of a workaround until then.

> I agree that http_req_duration_errors is unnecessary. Few people are interested in the response duration of failed requests. I retract my request for this metric. 👍
> I agree that the http_req_errors should be a Rate rather than a Counter 👍

👍 🤝

> Here's the terminal output of the above script.
>
> [screenshot: end-of-test summary with the new custom metrics highlighted in green]
>
> The new, correct metrics are highlighted in green. http_req_blocked and friends still account for both successful and failed requests. That's not great.
> One could replicate this logic for the remaining minor metrics, but at that point the script and output would be too messy.

Agree, the current output is messy. I'll respond to your other issue (#1319) later, but in general I agree that we shouldn't show most of the HTTP metrics we currently show, unless users explicitly request them or have a threshold on them. But we can't remove them until we have a way that allows users to add them back... So, I guess this is tracked in #570, for lack of a better issue.

> Again, I think that k6 should by default record only metrics for successful requests, and the manual workaround implemented above is not a viable solution.

Maybe, I can see how this makes sense, though I think this should be configurable... For example, you're thinking in terms of application errors (i.e. HTTP status code >= 400), but there are also network/DNS/HTTPS/etc. issues, for which the duration measurement might be more important... And we shouldn't forget intentional/unavoidable errors, which we should also measure (#1298), and that decision has to be taken locally, not be a global option.

In any case, I can see use cases for both measuring "errors" and for not measuring them, so, in the interest of not locking out any users, I think this should be configurable... We can have a flag (and other options) in the new JS HTTP API that enables/disables this behavior. In that API, not measuring them can also be the default value, if we think that's the generally correct approach, but in the current API we should preserve the current behavior, I think...
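
Purely as an illustration of that idea, something like the following could live in the new API; note that neither http.Client nor the option name exist today, both are invented here:

import http from "k6/http"; // assuming a future version of this module

// Hypothetical: error-measurement behavior as a per-client setting, so
// different scenarios can make different choices instead of one global option.
let strictClient = new http.Client({ measureFailedRequests: false });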

> > In any case, I think http_req_duration is a bit of a red herring in this case. Correct me if I'm wrong, but it seems to me that the problem isn't that http_req_duration becomes too small when there are errors. Rather, the problem is that there are errors and we're not detecting them (i.e. our thresholds pass and the CI status is green). So, I think users should just be able to set a threshold for the HTTP error rate, for example thresholds: { "http_req_errors": ["rate<0.01"] }. Then the http_req_duration values don't matter, since this other threshold would be triggered and fail the test.
>
> When doing stress testing, errors are expected to appear, and you want to continue the test for an extended period of time to see when the system recovers. I believe that it's still useful to know what the http_req_duration is for successful requests when 90% of other requests are failing. It helps to determine the stress failure mode. If errors are included in http_req_duration, the metric isn't meaningful.

Hmm you're right, this is a very valid use case. 👍

> Another reason for implementing something like this:
>
> let options = {
>    http_errors: ["status>399"],  // default setting
>    http_errors_exclude: [404, 403, 1099]
> };
>
> is for the Cloud UI to be able to understand which requests should be considered successful/failed.
>
> Today the cloud UI assumes that any status code >= 400 is a failure and marks it red in the UI. This is clearly incorrect in many cases. The cloud should be able to understand these settings and make use of them for the UI and performance alerts.

As I mentioned earlier, I strongly think that such options aren't the best UX for this behavior. And while I wouldn't mind something like this in the new HTTP API, I'm especially against having them in the global options. Even if we ignore #883, it doesn't make sense to me from a UX perspective, since in a single script it's quite possible to have some 404/403 responses that are expected and that you want to measure, and others that you consider actual errors.

So, I think if we just tag the metrics from the expectedly erroneous requests (however those are defined) with an expected_error: true metric tag (or something like that), it will be even easier for the cloud (and everything else, like thresholds especially) to distinguish such requests...
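
With such a tag in place, thresholds could target only the unexpected failures via the existing tag-based sub-metric syntax. A sketch, assuming every request gets tagged either "true" or "false" (expected_error itself is still the hypothetical tag):

export let options = {
    thresholds: {
        // Only consider response times of requests that were explicitly
        // tagged as not being expected errors.
        "http_req_duration{expected_error:false}": ["p(95)<500"],
    },
};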

> The error metric should be displayed in the cloud by default, with the same prominence as http_req_duration.

I don't understand this point, sorry.

> A complementary functionality would be to introduce a change to the HTTP API. I purposefully didn't mention it in the original issue description because that's a rabbit hole 😄
> In any case, something like this would be ideal for exceptions that are not handled by the global options configuration.
>
> http.get({url: "https://httpbin.test.loadimpact.com/status/503", expect: {status: 503}});

This proposal I actually like 😅 Again, it probably should fall under the new HTTP API, but in my head, this (or something like this) is a nice way to handle it. Basically, the k6 HTTP client can emit expected_error: true for the generated metrics if it sees that the status it received matches the expected one, and the cloud/thresholds/etc. can then treat the generated metrics accordingly.

> Even though this is a more flexible solution, I suggest we don't do that as a part of this work. I think the options solution solves the problem in 95% of cases, and the change to the HTTP API can still be implemented at a later date.

I'm trying to argue that it will be both easier and better (i.e. 100% of cases) not to have global options 😉

> > (...) we probably need some better way of specifying the rule. Maybe some sort of a JS-callback that evaluates HTTP responses before k6 emits metrics, maybe a more powerful rule engine (along the lines of #1313)...
>
> Whatever format we decide on, it needs to be readable by the cloud. It would be great if it didn't depend on JS.

Again, we already have per-request metrics; it would be much simpler for the cloud to process something it already understands than to parse (probably with subtle differences from k6...) some new global option.

> > We also shouldn't forget that non-2xx HTTP statuses isn't the only way an HTTP request can fail, network/DNS/protocol/etc. problems are also a possible error cause. So, whatever way we pick to filter these out, we should probably be able to handle these as well...
>
> That's a good point. Perhaps we should rename http_req_errors to req_errors so that it records all failures rather than just HTTP failures. What do you think?

Not sure... My gut feeling is that we shouldn't mix these things, so http_req_errors seems better to me right now. However, this is a bigger discussion regarding how we want to handle different protocols in the future...

> > Also, as I mentioned in #1298, maybe we're looking at this from the wrong angle. We have to consider both issues together, since it might be enough to just let users tag requests that they expect to fail with some new system tag - then, if k6 sees such a tag, it won't add a true value to the new http_req_errors metric?
>
> #1298 would benefit from the HTTP API change I'm suggesting above, but I think the options solution would also work for them. We could go deeper into discussing the HTTP change, but I'm afraid of 🐰 holes 😄

I disagree. As far as I understand, they want to consider only some 404 errors as expected. After their resource is created, I don't see why they'd want to ignore any other errors. Also, #1298 (comment) ... 😉

> > And if users want to only have a threshold for the response times of successful requests, they can mostly do it even now, by just specifying the threshold like this: thresholds: { "http_req_duration{status:200}": ["p(95)<100"] }. It's not very flexible, but if that's a big problem, we should just improve the threshold matching flexibility (#1313), not just add new metrics willy-nilly, since there would be no end of the metrics we'd like to add otherwise...
>
> Errors are fundamental to load/stress testing. We should track and display errors by default in k6 UI and cloud UI. This should not be left for the users to define and implement on their end as a special case. I believe we should also have default thresholds based on the rate of errors.

I agree 👍

> I'm sorry about the length of this comment. I need to learn how to be more concise 😅

Me too 😅

@ppcano (Contributor) commented Apr 21, 2020

wow!! I will be brave and participate in this conversation.

I won't discuss the details; I just want to share the cases where, in my experience, a default HTTP error metric could be useful:

Simple Threshold on HTTP errors

Easily define thresholds based on HTTP errors without having to define custom metrics:

thresholds: {
    'http_req_errors': ['rate<0.1'],
    'http_req_errors{custom-tag:hola}': ['rate<0.2']
},

Visualize HTTP errors in Datadog

Provide a general visualization of the HTTP error rate or error counter in Datadog. This is not possible today because Datadog does not allow querying a metric by tags using the OR operator:

status:200 OR status:201 OR status:202

When reviewing other Datadog integrations, I found the following Rigor error metrics:

  • rigor.real_browser.client_errors (gauge): number of responses with a status code between 400 and 499.

  • rigor.real_browser.connection_errors (gauge): number of responses where the status code is 504 or 0 (a browser-aborted request).

  • rigor.real_browser.server_errors (gauge): number of responses where the status code is 500 or higher (excluding 504).

  • rigor.real_browser.errors (gauge): total count of responses with status codes greater than or equal to 400.

@sniku (Collaborator, Author) commented Apr 28, 2020

I'm adding this screenshot from an overloaded system to the conversation:

[screenshot: end-of-test summary from an overloaded system, including HTTP-0 responses]

Currently, k6 includes HTTP-0 response times in the end-of-test summary. This is probably not what we want. The HTTP-0 responses have either a "0s" duration or a "60s" duration (timeout), which skews the results for the requests that actually finished.
In my opinion, HTTP-0 responses are errors and should be excluded.
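
A workaround sketch in the meantime, assuming the httpbin test host used earlier (the custom metric name is made up): track durations in a custom Trend only for requests that actually received a response:

import http from "k6/http";
import { Trend } from "k6/metrics";

// true = values are durations in milliseconds
let completedReqDuration = new Trend("http_req_duration_completed", true);

export default function () {
    let res = http.get("https://httpbin.test.loadimpact.com/status/200");
    if (res.status !== 0) {
        // Skip HTTP-0 responses (network errors / timeouts) entirely.
        completedReqDuration.add(res.timings.duration);
    }
}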

@simskij (Contributor) commented Jan 20, 2021

@sniku, @robingustafsson and @na--, can we bump http_reqs_errors to be included in v0.31.0? It would greatly improve the UX of test building, even if it's possible to do it manually today.

@na-- (Member) commented Jan 20, 2021

We still don't really have an agreement on what http_reqs_errors should be and how it should be configured, and we can't really prioritize it without consensus on these things. I thought @sniku's httpx wrapper on https://jslib.k6.io/ was an attempt to arrive at a nice design for this feature, and also a viable workaround until we have something ready to bake into k6?

And IIRC, that eventual design was tentatively intended to be included in the new k6 HTTP API, not the current one. Not only because we probably want #1321 beforehand and we don't want to make #883 worse with global options, but also because global options will likely be too restrictive for this feature. You might want to treat HTTP errors one way in one scenario and another way in other scenarios. So, these distinctions need to be made on something like an HTTP-client level, not globally, which is what httpx currently emulates.

@ppcano (Contributor) commented Jan 22, 2021

Prioritization

> We still don't really have an agreement

@sniku I think we should spend some time finding an agreement on this API.

> And IIRC, that eventual design was tentatively intended to be included in the new k6 HTTP API.

I don't know the status of the new k6 HTTP API, but it looks experimental and further in the future. I wonder what the impediments are to including this in k6/http.

API Design

Cosmetic preferences aside, IMO the global option together with the per-request expect tag/option solves all the cases with good UX; a combined sketch follows below. I believe most people will rely on the global option, while the per-request tag will cover the edge cases.
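
To make that concrete, the two mechanisms would combine roughly like this (all of it hypothetical; neither the http_errors options nor the expect parameter exist in k6):

import http from "k6/http";

export let options = {
    // Hypothetical global rule: everything above 399 is an error...
    http_errors: ["status>399"],
    // ...except these codes, which this test treats as expected.
    http_errors_exclude: [404, 403],
};

export default function () {
    // Hypothetical per-request override, as sketched earlier in the thread:
    // a 503 from this endpoint is expected, so it would not be counted in
    // the proposed error metric.
    http.get({url: "https://httpbin.test.loadimpact.com/status/503", expect: {status: 503}});
}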

Breaking changes

I think we all agree that error response times should "somehow" be separated from the response times of successful requests. I raise the concern that if error response times are excluded from http_req_duration, it could break some existing thresholds.

@sniku (Collaborator, Author) commented Jan 22, 2021

@ppcano, we discussed this feature during the last k6 sync call, and we have decided to work on the specification. The feature is not as simple as I initially thought, so it takes some time to work out all the edge cases and address all the valid concerns raised above.

We plan to include this feature in the v0.31 release. I'll update this issue or create a new one when the specification has been finalized, hopefully today.

@na-- (Member) commented Jan 29, 2021

Closing this in favor of the new issue: #1828

@na-- closed this as completed Jan 29, 2021