
feat: Create performance tests [WEB-1458] #7741

Merged
merged 26 commits into main from web-1458 on Aug 30, 2023

Conversation

julian-determined-ai (Contributor) commented Aug 25, 2023

Description

The goal of this PR is twofold: introduce load tests to benchmark the current performance of our system, and put a structure in place that makes it easy to add future tests. This description covers the current testing setup, future possibilities and enhancements, and a few notes about k6 that may matter in future updates.

For reference, there is a prototype branch, web-1458-prototype, that contains a more in-depth setup for the load tests.

Current Setup

Running the test

  1. Install k6 and the junit2html Python package.
    (within performance/determined)
  2. npm install to install dependencies.
  3. npm start to build the api_performance_tests.js test file.
  4. k6 run -e DET_MASTER=http://localhost:8080 build/api_performance_tests.js to run the file built in (3); the DET_MASTER env var sets the URL for the test cluster.
  5. A junit.xml file will be generated containing a test report.
  6. junit2html junit.xml to create an HTML report from the generated XML (the full sequence is consolidated in the sketch below).
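
For convenience, the same sequence as shell commands (a sketch assembled from the steps above; the cluster URL is just the local default):

    # within performance/determined
    npm install
    npm start                                  # builds build/api_performance_tests.js
    k6 run -e DET_MASTER=http://localhost:8080 build/api_performance_tests.js
    junit2html junit.xml                       # converts the generated junit.xml into an HTML report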

The results of the test are:

  1. Console output of metrics collected (example shown below).
  2. A junit.xml file with pass/fail test results as well as http request duration statistics for each test.
  3. An HTML file created from (2) with similar information.

To note: I had originally planned on implementing the test schema in web-1458-prototype, but after discussions with @ashtonG we decided that it was a bit excessive for the ultimate goal of this ticket.

Example Results

Console output example:
(screenshot)

jUnit output example (html version):
(screenshot)

Test Logic

The current testing setup benchmarks system performance with a single "average load" test that simulates a ramping number of queries to the master. The test simulates 25 users querying the master endpoint: it ramps up to 25 users over the course of 5 minutes, sustains that request rate for 10 minutes, then ramps down to 0 users over a period of 5 minutes, for a total runtime of 20 minutes. This is meant to simulate an average load on the system. The figure of 25 users was based on the number of users that Recursion has in their system, which is about 20.
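
For reference, a minimal sketch of that stage configuration using k6's standard options export (not necessarily the exact code in this PR):

    import { Options } from 'k6/options';

    export const options: Options = {
        stages: [
            { duration: '5m', target: 25 },   // ramp up to 25 virtual users over 5 minutes
            { duration: '10m', target: 25 },  // hold 25 virtual users for 10 minutes
            { duration: '5m', target: 0 },    // ramp back down to 0 over 5 minutes
        ],
    };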

k6 allows setting thresholds for a given test, and two are currently set. First, I added a request-failure threshold that will abort and fail the test if more than 1 percent of the HTTP requests fail. The idea is that if that many requests are failing, we likely want to investigate the cause and should not allow the test to pass.

Second, I added a threshold on request duration: it expects more than 95% of all HTTP requests to complete in less than 1 second, which is the overall performance goal for our system. The test suite is currently set up so that tests will not fail if this threshold is crossed; however, it gives us the ability to easily view this metric in test reports.
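
A sketch of those two thresholds, assuming k6's standard threshold syntax (the abortOnFail flag here is one way to express the abort behavior; the actual configuration in the PR may differ):

    import { Options } from 'k6/options';

    export const options: Options = {
        thresholds: {
            // Abort and fail the run if more than 1% of HTTP requests fail.
            http_req_failed: [{ threshold: 'rate<0.01', abortOnFail: true }],
            // The overall performance goal: 95% of requests complete in under 1 second.
            http_req_duration: ['p(95)<1000'],
        },
    };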

Sample Extension

Additional tests can be added using the test construct created in this PR. An example extra test to query the telemetry endpoint is:

    // k6 built-ins used below:
    import http from 'k6/http';
    import { check } from 'k6';
    // `test`, `generateEndpointUrl`, and `clusterURL` come from the harness added in this PR.

    test(
        'visit telemetry endpoint',
        () => {
            const res = http.get(generateEndpointUrl('/api/v1/master/telemetry', clusterURL));
            check(res, { '200 response': (r) => r.status === 200 });
        }
    );

Example output after adding the test (the testing stages were made shorter for example purposes):

Console output:
(screenshot)

jUnit output example (html version):
(screenshot)

Test Plan

Commentary

Future Possibilities, Enhancements, and Important Notes

Reference: web-1458-prototype

Test Structure

Load testing often implements different types of test scenarios in order to track system performance under different situations. The most common scenarios are smoke, average-load, stress, soak, spike, and breakpoint tests.

An example setup in k6 would look similar to this:

import { Scenario } from 'k6/options';

const scenarios: { [name: string]: Scenario } = {
    smoke: {
        executor: 'shared-iterations',
        vus: 5,
        iterations: 5,
    },
    average_load: {
        executor: 'ramping-vus',
        stages: [
            { duration: '10s', target: 50 },
            { duration: '60s', target: 50 },
            { duration: '10s', target: 0 },
        ],
        startTime: '5s',
    },
    stress: {
        executor: 'ramping-vus',
        stages: [
            { duration: '10s', target: 175 },
            { duration: '20s', target: 175 },
            { duration: '10s', target: 0 },
        ],
        startTime: '90s',
    },
    soak: {
        executor: 'ramping-vus',
        stages: [
            { duration: '5s', target: 50 },
            { duration: '1m', target: 50 },
            { duration: '1m', target: 0 },
        ],
        startTime: '135s',
    },
    spike: {
        executor: 'ramping-vus',
        stages: [
            { duration: '1m', target: 500 },
            { duration: '15s', target: 0 },
        ],
        startTime: '265s',
    },
};

Example console output:
(screenshot)

Example jUnit output:
(screenshot)

In the above example, virtual users are spun up and down over a specified duration of time to simulate variations in web traffic. You can reference the k6 scenario documentation to learn more about how this configuration works.

In the future we will likely want to move to this sort of test scheme so that we can gather a holistic view of our system performance under different load types. Additionally, we will likely want to implement much longer running tests for some scenarios.

User Initialization

During implementation planning with @stoksc we discussed that we will likely want to be able to track unique users per test. For example, we might want to track performance for RBAC users with differing permissions. k6 has a few utilities for using unique data within tests; however, there are some caveats. The largest is that k6 does not allow making HTTP requests during the test init phase, meaning we cannot implement logic such as:

  1. Before the start of the test suite, query the cluster for the current set of users.
  2. Assign each virtual user to a Determined user from (1).

There are alternative workflows we could implement, but I found it worth calling this out; one possibility is sketched below.
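
One possible workaround (a sketch, not part of this PR): load a pre-generated list of users from a local file in the init phase, since open() and SharedArray are allowed there even though HTTP requests are not. The file name and user shape here are hypothetical.

    import { SharedArray } from 'k6/data';

    // Read once in the init phase and share across virtual users (hypothetical users.json).
    const users = new SharedArray('users', () => JSON.parse(open('./users.json')));

    export default function (): void {
        // __VU is k6's 1-based virtual-user id; use it to give each VU its own user.
        const user = users[(__VU - 1) % users.length];
        // ...log in as `user` and run the per-user test logic here...
    }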

The current setup does not implement any sort of login or unique user configuration.

API Bindings and TypeScript

The k6-recommended k6-template-typescript project was used to generate a TypeScript project for our test suite. The current setup does not use our generated TypeScript API bindings, but in the future we may want to; this was one of the main considerations for making this a TypeScript project. I wanted to add this note since it came up in discussions with @loksonarius.

Result Reporting

There are a few quirks around metric reporting that were found during this implementation that are worth calling out.

Limitations around reporting results within k6

k6 gives the ability to tag and group tests in various ways: tests can easily be tagged via custom tags, endpoint, groups, etc. It also lets you render a custom report via a handleSummary function defined within the test suite. However, all information and details regarding tags are scrubbed from the data that the handleSummary function receives. You can read more in this GitHub issue grafana/k6#1321 as well as this thread about the lack of tag data: https://community.grafana.com/t/show-tag-data-in-output-or-summary-json-without-threshold/99320
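
For context, a minimal sketch of what a handleSummary hook looks like (not the exact implementation in this PR):

    // handleSummary receives the aggregated end-of-test data and returns a map of
    // destinations (file paths, or 'stdout') to report content.
    export function handleSummary(data: any): { [dest: string]: string } {
        return {
            // The aggregated metrics (e.g. data.metrics.http_req_duration.values) carry no per-tag breakdown.
            'summary.json': JSON.stringify(data, null, 2),
            stdout: JSON.stringify(data.metrics.http_req_duration.values, null, 2) + '\n',
        };
    }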

For example, no matter how you tag any metrics, even custom metrics, the data available for writing the report will look as follows:

"http_req_duration":{
         "type":"trend",
         "contains":"time",
         "values":{
            "p(95)":2.2873,
            "avg":1.0244,
            "min":0.353,
            "med":0.5935,
            "max":2.308,
            "p(90)":2.2666
         }
      },

As you can see, there are no mentions of any tags. The workaround is to add thresholds for each tag that you want to follow; this causes k6 to show more information about the tag in the output. A code example can be seen here: https://github.com/determined-ai/determined/compare/web-1458-prototype#diff-31e2b17ee608e49eacf18f0b0b17988d36964f621b588600e701bfce8466649aR69, and you can see in the example outputs above how the tag information becomes available in the test output.
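
For illustration, a sketch of that workaround using k6's tag-filtered sub-metric threshold syntax (the tag name and value here mirror the example test above, but are otherwise illustrative):

    import { Options } from 'k6/options';

    export const options: Options = {
        thresholds: {
            // A threshold on a tagged sub-metric makes that tag's statistics appear in the summary output.
            'http_req_duration{test:visit master endpoint}': ['p(95)<1000'],
        },
    };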

Future Results Reporting in k6

Thankfully, all information regarding tags is kept within the individual data points created during testing. This per-point data is what would be sent to Grafana, for example, when we implement external result viewing, so we will still be able to build custom dashboards and charts once we decide to send results to a time-series DB.

Additionally, the individual data points mentioned above can be written to a JSON or CSV file. In the future we could write custom file-parsing logic to build a more in-depth report from the data in the output file.
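
For example, writing the per-point data out uses k6's standard --out flag (file name illustrative):

    k6 run -e DET_MASTER=http://localhost:8080 --out json=results.json build/api_performance_tests.js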

For reference, here is an example point from the file mentioned above; you will notice that all tag data is present.

{
   "metric":"http_reqs",
   "type":"Point",
   "data":{
      "time":"2023-08-25T13:12:56.940746-05:00",
      "value":1,
      "tags":{
         "expected_response":"true",
         "group":"",
         "method":"GET",
         "name":"http://localhost:8080/api/v1/master",
         "proto":"HTTP/1.1",
         "scenario":"smoke",
         "status":"200",
         "test":"visit master endpoint",
         "url":"http://localhost:8080/api/v1/master"
      }
   }
}

Checklist

  • Changes have been manually QA'd
  • User-facing API changes need the "User-facing API Change" label.
  • Release notes should be added as a separate file under docs/release-notes/.
    See Release Note for details.
  • Licenses should be included for new code which was copied and/or modified from any external code.

Ticket

WEB-1458

cla-bot added the cla-signed label Aug 25, 2023

netlify bot commented Aug 25, 2023

Deploy Preview for determined-ui canceled.

🔨 Latest commit: ce42d07
🔍 Latest deploy log: https://app.netlify.com/sites/determined-ui/deploys/64ef8b8773685f000853dd83

Comment on lines +9 to +22
"@babel/core": "7.13.16",
"@babel/plugin-proposal-class-properties": "7.13.0",
"@babel/plugin-proposal-object-rest-spread": "7.13.8",
"@babel/preset-env": "7.13.15",
"@babel/preset-typescript": "7.13.0",
"@types/k6": "^0.45.3",
"@types/webpack": "5.28.0",
"babel-loader": "8.2.2",
"clean-webpack-plugin": "4.0.0-alpha.0",
"copy-webpack-plugin": "^9.0.1",
"typescript": "4.2.4",
"webpack": "5.76.1",
"webpack-cli": "5.0.1",
"webpack-glob-entries": "^1.0.1"
Contributor:

given that the webui uses vite/esbuild and this has us using webpack/babel, consider tagging some work to migrate this to vite using library mode: https://vitejs.dev/guide/build.html#library-mode

Contributor:

🙏

@hkang1 (Contributor) left a comment:

This looks great! I was able to run through all the steps and see the results and JUNIT export. Left some comments.

How did you generate the JUNIT html pages?

julian-determined-ai (Contributor Author) replied:

> This looks great! I was able to run through all the steps and see the results and JUNIT export. Left some comments.
>
> How did you generate the JUNIT html pages?

Another thing to note: k6 does not really have native support for outputting to different report formats. They do recommend some [helpful libraries](https://github.com/grafana/awesome-k6), but the bulk of them are small repos maintained by a single individual, which I don't think we would want to depend on ourselves. Similar to your comment here: #7741 (comment)

The HTML was actually created using a Python library, junit2html.

I support adding a CI step to create the HTML report from the XML and adding that as an artifact in CircleCI. The report that is generated off the shelf is decent; it does allow us to easily view failures/passes, but I can imagine we might want to make better reports in the future. How does adding an extra CI step to create the HTML file sound, @hkang1?


hkang1 commented Aug 30, 2023

That sounds great, having it as an artifact would be sweet!

@hkang1 (Contributor) left a comment:

Thanks for the updates!

julian-determined-ai merged commit f8caa0e into main Aug 30, 2023
76 of 87 checks passed
julian-determined-ai deleted the web-1458 branch August 30, 2023 21:59
dannysauer added this to the 0.25.1 milestone Feb 6, 2024