
Code Coverage FAQ

mtail edited this page Apr 22, 2019 · 4 revisions

How does it work?

The job runs all tests under istio.io/istio twice, except those excluded in codecov.skip. The tests include all unit tests and the tests that use the integration test framework with a local environment.

In the first pass, the tests run at the PR head. In the second pass, they run at the PR base (determined using a GitHub API). A Go test then compares the two sets of coverage results and fails if the coverage of any file drops by more than the threshold configured in codecov.threshold, which defines the default Istio-wide threshold along with per-package and per-file overrides.
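The comparison step described above can be sketched roughly as follows. This is a minimal illustration, not the job's actual code: the function names `thresholdFor` and `findDrops`, the longest-prefix rule for overrides, the choice to skip files that no longer exist at the PR head, and the second file path are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// thresholdFor returns the allowed coverage drop (in percentage points) for a file.
// Assumption: the longest matching package/file prefix in overrides wins;
// otherwise the default threshold applies.
func thresholdFor(file string, def float64, overrides map[string]float64) float64 {
	best, limit := "", def
	for prefix, t := range overrides {
		if strings.HasPrefix(file, prefix) && len(prefix) > len(best) {
			best, limit = prefix, t
		}
	}
	return limit
}

// findDrops compares per-file coverage (in percent) at the PR base and the PR head
// and reports every file whose coverage fell by more than its threshold.
func findDrops(base, head map[string]float64, def float64, overrides map[string]float64) []string {
	var drops []string
	for file, before := range base {
		after, ok := head[file]
		if !ok {
			continue // assumption: files missing at the PR head are not compared
		}
		delta := after - before
		if -delta > thresholdFor(file, def, overrides) {
			drops = append(drops, fmt.Sprintf("%s:%f%% (%f%% to %f%%)", file, delta, before, after))
		}
	}
	sort.Strings(drops) // stable output order
	return drops
}

func main() {
	base := map[string]float64{
		"istio.io/istio/pilot/pkg/model/push_context.go": 80.6,
		"istio.io/istio/hypothetical/pkg/a.go":           50.0, // hypothetical file
	}
	head := map[string]float64{
		"istio.io/istio/pilot/pkg/model/push_context.go": 74.2,
		"istio.io/istio/hypothetical/pkg/a.go":           49.0,
	}
	// With a default threshold of 5 points, only push_context.go is reported.
	for _, d := range findDrops(base, head, 5, nil) {
		fmt.Println(d)
	}
}
```

Passing an override such as `map[string]float64{"istio.io/istio/pilot": 10}` would tolerate the 6.4-point drop in this sketch, which mirrors how a per-package entry in codecov.threshold relaxes the default.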

How long does it run?

Less than 10 minutes typically.

Why is the job PR gating?

We don't want unexpected drops in code coverage.

What signals does it give me?

In addition to the codecov job status (pass/fail), the job produces files in go/out/codecov/ as artifacts. The files generated from tests at the PR head can be found at go/out/codecov/pr, and the files generated at the PR base can be found at go/out/codecov/baseline. The codecov.skip and codecov.threshold files used by the job are also stored in go/out/codecov/ for reference. See the following example:

In particular, go/out/codecov/pr/coverage.html and go/out/codecov/baseline/coverage.html show graphical coverage views of every tracked file. Example:

What happens if the codecov job fails in my PR?

Look at the test log. If you see something like the following, coverage has dropped unexpectedly.

--- FAIL: TestCheckCoverage (5.74s)
       main_test.go:200: Coverage dropped:
              istio.io/istio/pilot/pkg/model/push_context.go:-6.400000% (80.600000% to 74.200000%)

You can compare the generated go/out/codecov/pr/coverage.html and go/out/codecov/baseline/coverage.html to see the difference visually.

The best action is to add more tests to restore the coverage. If the drop comes from tests in non-production packages, you can skip those packages by updating codecov.skip. If you believe the drop is expected, or if the failure is due to flaky coverage (see known issues below), you can update codecov.threshold to relax the tolerance for the file or package. Both files are in the root directory on purpose, and changing them requires top-level approval.

The codecov job also fails if any of the tests fail. Please check the other circle test jobs and make sure your PR has not broken any test.
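If a log lists many files, a small helper can pull the affected files and percentages out of the "Coverage dropped" lines. The sketch below assumes the log lines look exactly like the sample above; the `drop` type, the `parseDrops` function, and the regular expression are illustrative and not part of the codecov job.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// drop holds one reported coverage drop, in percent.
type drop struct {
	File   string
	Delta  float64
	Before float64
	After  float64
}

// dropLine matches lines such as:
//   istio.io/istio/pilot/pkg/model/push_context.go:-6.400000% (80.600000% to 74.200000%)
// Assumption: this mirrors the sample log format shown above.
var dropLine = regexp.MustCompile(`^\s*(\S+):(-?\d+(?:\.\d+)?)% \((\d+(?:\.\d+)?)% to (\d+(?:\.\d+)?)%\)\s*$`)

// parseDrops scans a test log and returns every line that reports a coverage drop.
func parseDrops(log string) []drop {
	var out []drop
	for _, line := range strings.Split(log, "\n") {
		m := dropLine.FindStringSubmatch(line)
		if m == nil {
			continue // not a coverage-drop line
		}
		// The regexp only admits plain decimal numbers, so parse errors are not expected.
		delta, _ := strconv.ParseFloat(m[2], 64)
		before, _ := strconv.ParseFloat(m[3], 64)
		after, _ := strconv.ParseFloat(m[4], 64)
		out = append(out, drop{File: m[1], Delta: delta, Before: before, After: after})
	}
	return out
}

func main() {
	log := `--- FAIL: TestCheckCoverage (5.74s)
       main_test.go:200: Coverage dropped:
              istio.io/istio/pilot/pkg/model/push_context.go:-6.400000% (80.600000% to 74.200000%)`
	for _, d := range parseDrops(log) {
		fmt.Printf("%s dropped %.1f points (%.1f%% -> %.1f%%)\n", d.File, -d.Delta, d.Before, d.After)
	}
}
```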

Do we still use codecov.io?

We no longer upload coverage data to codecov.io in pre-submit. We do, however, continue to upload coverage in post-submit, so there is still a convenient view for browsing and drilling down into Istio coverage. See https://codecov.io/gh/istio/istio.

Can I see the historical run of these codecov jobs?

Yes. See https://testgrid.k8s.io/istio-presubmits#circleci-codecov-tests

Known Issues

Flaky Tests

The codecov job fails if any test fails during the first two steps (the two test runs). Therefore, the job can be as flaky as the circle (unit) test job. However, some package-level retries have been added to the codecov job, so it should be somewhat more resilient to flakes. If your codecov job fails due to a test flake, retry it.
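The idea behind a package-level retry can be sketched as follows. This is only an illustration of the technique, not the job's implementation: `runWithRetry`, the `failFirst` helper that simulates a flaky package, and the attempt count of 3 are all assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// runWithRetry calls run for pkg up to attempts times and returns nil on the
// first success. If every attempt fails, the last error is wrapped and returned.
func runWithRetry(pkg string, attempts int, run func(pkg string) error) error {
	if attempts < 1 {
		attempts = 1 // always try at least once
	}
	var err error
	for i := 1; i <= attempts; i++ {
		if err = run(pkg); err == nil {
			return nil
		}
		fmt.Printf("attempt %d/%d for %s failed: %v\n", i, attempts, pkg, err)
	}
	return fmt.Errorf("%s failed after %d attempts: %w", pkg, attempts, err)
}

// failFirst returns a hypothetical test runner that fails its first n calls
// and succeeds afterwards, plus a pointer to its call counter.
func failFirst(n int) (func(string) error, *int) {
	calls := 0
	return func(string) error {
		calls++
		if calls <= n {
			return errors.New("simulated flake")
		}
		return nil
	}, &calls
}

func main() {
	run, calls := failFirst(1)
	err := runWithRetry("istio.io/istio/pilot/pkg/model", 3, run)
	fmt.Println("calls:", *calls, "err:", err)
}
```

Retrying per package rather than per job keeps a single flaky package from forcing a rerun of every test, at the cost of possibly hiding a genuinely intermittent failure.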

Flaky Coverage

There is some level of trial and error in the Istio test implementation. As a result, some error-handling code may or may not be exercised, depending on timing or other factors, and the computed coverage may vary across runs. The best fix is to add specific unit tests so that all error cases are explicitly covered. If that cannot be done easily, you can override the thresholds for those files in codecov.threshold as a workaround.

Refactoring

If a Go implementation file is refactored into a pure interface, its coverage drops to 0%. The only workaround is to add the Go file to codecov.threshold and set its threshold to 100.
