Interop can be scored for partially aligned runs #216

gsnedders · 2024-03-14T15:55:23Z

There's nothing technically stopping us from scoring Interop for browsers when we don't have aligned runs for all browsers on a given day.

This would lessen the impact of a browser not having results on wpt.fyi for a prolonged period (cf. web-platform-tests/wpt#44366), though it does make the cross-browser Interop score hard to update.

We have several options here:

1. Find the aligned run with the largest number of browsers on a given day.
2. If we don't have an aligned run with all browsers, fall back to the latest run (if any) of each browser on that day.
3. (The status quo:) Don't update Interop at all when we don't have an aligned run with all browsers.
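For illustration, the first two options could be sketched roughly as below. This is only a sketch under stated assumptions, not the existing results-analysis code: the `Run` record, its fields, and both function names are hypothetical, and runs are treated as "aligned" when they share a wpt revision SHA.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    # Hypothetical record; real wpt.fyi run metadata carries more fields.
    browser: str
    sha: str   # wpt revision the run was made against
    time: str  # ISO 8601 UTC timestamp, so string comparison orders runs


def largest_aligned_set(runs):
    """Option 1: pick the revision covered by the most browsers that day."""
    by_sha = defaultdict(dict)
    for run in runs:
        # Keep only the latest run per browser for each revision.
        current = by_sha[run.sha].get(run.browser)
        if current is None or run.time > current.time:
            by_sha[run.sha][run.browser] = run
    if not by_sha:
        return {}
    # Prefer more browsers; break ties by the most recent run time.
    return max(
        by_sha.values(),
        key=lambda group: (len(group), max(r.time for r in group.values())),
    )


def select_runs(runs, browsers):
    """Option 2: use a fully aligned set if one exists, else latest per browser.

    Returns (runs_by_browser, is_aligned).
    """
    wanted = [r for r in runs if r.browser in browsers]
    best = largest_aligned_set(wanted)
    if set(best) == set(browsers):
        return dict(best), True
    latest = {}
    for run in wanted:
        if run.browser not in latest or run.time > latest[run.browser].time:
            latest[run.browser] = run
    return latest, False
```

With option 2, the returned flag lets the caller skip the cross-browser Interop score whenever the selected runs are not all from the same revision.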

The biggest risk here is that the Interop dashboard ends up showing scores based on different sets of tests for different browsers (depending on how many tests have changed since the last fully aligned run). However, the current situation is that browsers get no reward for shipping features and bug fixes that advance Interop.

foolip · 2024-03-14T22:11:36Z

@jgraham IIRC your Rust rewrite isn't limited to aligned runs. How did you handle the problem there?

I think something like this would work:

1. Fetch all runs for all browsers, not just aligned runs.
2. Find aligned runs and score those, similar to today.
3. For any dates that did not have aligned runs, score each browser individually, and record no interop score at all.
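A rough sketch of these steps, plus the "latest score of each type" rule for the frontend, might look like the following. Everything here is an assumption for illustration: the `(date, browser, sha)` tuple shape, the function names, and the caller-supplied `score_browser` and `score_interop` callbacks are hypothetical stand-ins for the real scoring code.

```python
from collections import defaultdict


def score_by_date(runs, browsers, score_browser, score_interop):
    """runs: (date, browser, sha) tuples in chronological order.

    score_browser(browser, sha) and score_interop(sha) are placeholders
    supplied by the caller; they stand in for the real scoring logic.
    """
    wanted = set(browsers)
    by_date = defaultdict(list)
    for date, browser, sha in runs:  # step 1: all runs, aligned or not
        if browser in wanted:
            by_date[date].append((browser, sha))

    results = {}
    for date, day_runs in sorted(by_date.items()):
        seen = defaultdict(set)
        for browser, sha in day_runs:
            seen[sha].add(browser)
        # Step 2: a revision is aligned if every browser has a run on it;
        # walking backwards picks the aligned revision with the latest run.
        aligned_sha = next(
            (sha for _, sha in reversed(day_runs) if seen[sha] == wanted), None
        )
        if aligned_sha is not None:
            results[date] = {
                "browsers": {b: score_browser(b, aligned_sha) for b in browsers},
                "interop": score_interop(aligned_sha),
            }
        else:
            # Step 3: score each browser on its own latest run, no interop score.
            latest = {browser: sha for browser, sha in day_runs}
            results[date] = {
                "browsers": {b: score_browser(b, s) for b, s in latest.items()},
                "interop": None,
            }
    return results


def latest_scores(results):
    """Frontend view: the most recent available score of each type."""
    latest = {"browsers": {}, "interop": None}
    for date in sorted(results, reverse=True):
        entry = results[date]
        for browser, score in entry["browsers"].items():
            latest["browsers"].setdefault(browser, score)
        if latest["interop"] is None and entry["interop"] is not None:
            latest["interop"] = entry["interop"]
    return latest
```

In this sketch the per-browser scores and the interop score can come from different dates, which is the trade-off discussed above.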

The frontend then shows the data we have. The latest score of each type is used, which might not come from an aligned run.

We could do this retroactively.

foolip · 2024-03-14T22:13:29Z

What I've described is @gsnedders's option 2, I think. If option 1 turns out to be easy when "find aligned runs" is implemented locally and not on the server, that would work too.

jgraham · 2024-03-15T09:21:26Z

https://github.com/jgraham/interop-results/tree/main/2024/results/revisions just has per-revision results for every browser (in the product set we care about). They are generated once, i.e. a past revision is never rescored if the metadata changes. No interop score is calculated.

https://github.com/jgraham/interop-results/tree/main/2024/latest/aligned has both "current" (i.e. with rescoring) and "historic" versions of aligned runs. The -daily variants only have the last aligned run in a given day, which is the same as we have today.
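As a small illustration of the "last aligned run in a given day" rule, a filter along these lines would do it. This is a hypothetical sketch, not the Rust implementation: the function name and the `(timestamp, sha)` input shape are assumptions, and timestamps are assumed to be ISO 8601 strings in UTC.

```python
def last_per_day(aligned_runs):
    """aligned_runs: iterable of (timestamp, sha) pairs for aligned runs.

    Returns one pair per calendar day, keeping the latest run of each day.
    """
    daily = {}
    for timestamp, sha in sorted(aligned_runs):
        # The first 10 characters of an ISO 8601 timestamp are YYYY-MM-DD;
        # later runs on the same day overwrite earlier ones.
        daily[timestamp[:10]] = (timestamp, sha)
    return list(daily.values())
```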

I was imagining the frontend allowing two things: a toggle between "current" and "historic" mode, which affects the graph, and the ability to select a specific SHA and see the (historic) scores for that run (but without an "Interop" score unless it happens to be an aligned run).

foolip · 2024-03-15T11:26:36Z

I see. Do you want to switch this whole code base to Rust in the near term, or should we try to fix the problem in the current JS code?

jgraham · 2024-03-15T11:41:05Z

In theory the Rust-generated CSV files should be usable as a drop-in replacement for the current data, independent of new features. So I'd propose starting with that. There is one bug I know about in the Rust code (we're not correctly recording the metadata revision used to generate each historic entry), but since that's not in the current data I don't think it would affect that transition.

Obviously we should also validate that we're really getting the same results from both systems (and I think @gsnedders would have preferred a different implementation based on https://github.com/gsnedders/results-analysis/tree/rust, but I hope that's not a blocker).

gsnedders mentioned this issue Mar 14, 2024

Interop stable dashboard not updating web-platform-tests/interop#647

Open
