Add script for computing plain pass/total scores over time #156

foolip · 2023-02-21T09:17:33Z

No description provided.

foolip · 2023-02-21T09:18:24Z

This is what I was playing with when I noticed we've passed 50,000 tests, and took a look at the growth over time:
https://mastodon.nu/@foolip/109879573138978166
https://mastodon.nu/@foolip/109897803628119305

foolip · 2023-02-21T09:28:53Z

A few observations after tinkering with this:

The big picture looks very similar regardless of scoring method. To me that's an argument for the simplest possible "binary" approach, which ignores the nuances of harness status and subtests.
It seems important to fix the TODO about the periods of time, mostly mid-to-late 2018, where we ran both Safari 11.1 and 12.1 and the results are massively different; fixing that might remove some of the noise. It would be better to have a clean series of runs per browser, skipping missing results per browser, at least in graphs like these that don't "join" results between browsers.
It would be good to use the manifest as the source of truth for which tests exist.

Add script for computing plain pass/total scores over time

f1f3177
