Provide an API for listing a series of known good runs #3508

Open
foolip opened this issue Sep 15, 2023 · 3 comments

@foolip
Member

foolip commented Sep 15, 2023

Analysis done outside of wpt.fyi using runs from /api/runs usually ends up needing to filter the returned runs to get sensible results. For example, https://github.com/web-platform-tests/results-analysis/blob/main/bad-ranges.js filters out known bad ranges of runs.
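
To illustrate the kind of client-side filtering described above, here is a minimal Python sketch. It is not a port of bad-ranges.js: the BAD_RANGES list, its dates, and the helper names are made up for illustration, and the run dictionaries are assumed to carry a time_start field holding an ISO 8601 timestamp.

```python
from datetime import datetime

# Hypothetical bad ranges as half-open [start, end) ISO dates.
# These values are invented for the example, not real bad ranges.
BAD_RANGES = [
    ("2023-01-10", "2023-01-12"),
]


def parse(ts):
    """Parse an ISO 8601 date or timestamp, accepting a trailing 'Z'.

    The timezone is dropped so that dates and timestamps compare directly;
    this assumes every timestamp is already in UTC.
    """
    return datetime.fromisoformat(ts.replace("Z", "+00:00")).replace(tzinfo=None)


def is_bad(run, bad_ranges=BAD_RANGES):
    """Return True if the run started inside any known bad range."""
    start = parse(run["time_start"])
    return any(parse(lo) <= start < parse(hi) for lo, hi in bad_ranges)


def filter_runs(runs, bad_ranges=BAD_RANGES):
    """Drop runs that fall inside a known bad range, preserving order."""
    return [run for run in runs if not is_bad(run, bad_ranges)]
```

Every consumer of /api/runs currently has to carry some version of this logic, which is the duplication this issue is about.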

In code that I've written in the past, I've also needed to handle time periods where the browser version was flip-flopping between N and N+1, either because two configurations were running at the same time, or because there were regressions leading to pinning of the browser version, and later unpinning.

Here are the guarantees that I think wpt.fyi could usefully provide for a series of runs for a single browser:

  • Known bad runs are excluded
  • The start time of the run is monotonically increasing
  • The browser version is monotonically increasing (given flip-flopping, ideally it would pick the breakpoint that gives the shortest gap between runs, or perhaps the largest total number of runs)
  • The OS version is monotonically increasing (same considerations as above)
  • The browser channel only "increases" (in case we switch the default Chrome config from dev to canary)
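
One way to read the "largest total number of runs" option for the version guarantee is as a longest non-decreasing subsequence problem. The sketch below is only that interpretation, not an existing wpt.fyi feature. It assumes run dictionaries with time_start and browser_version fields, timestamps in a single sortable ISO 8601 format, and dotted numeric version strings; version_key and monotonic_series are hypothetical names.

```python
from bisect import bisect_right


def version_key(version):
    """Turn '117.0.5938.62' into (117, 0, 5938, 62) for comparison.

    Non-numeric components are dropped, which is a simplification.
    """
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def monotonic_series(runs):
    """Return the longest time-ordered series with non-decreasing version."""
    # Assumes time_start strings share one format, so they sort correctly.
    runs = sorted(runs, key=lambda r: r["time_start"])
    keys = [version_key(r["browser_version"]) for r in runs]
    tails = []      # tails[k]: index of the run ending the best series of length k + 1
    tail_keys = []  # version keys of those runs, always sorted
    prev = [None] * len(runs)
    for i, key in enumerate(keys):
        # bisect_right allows equal versions to extend a series.
        pos = bisect_right(tail_keys, key)
        prev[i] = tails[pos - 1] if pos > 0 else None
        if pos == len(tails):
            tails.append(i)
            tail_keys.append(key)
        else:
            tails[pos] = i
            tail_keys[pos] = key
    # Walk back from the end of the longest series.
    result = []
    i = tails[-1] if tails else None
    while i is not None:
        result.append(runs[i])
        i = prev[i]
    return result[::-1]
```

The same approach would apply to the OS version by swapping the key function. It maximizes the number of runs kept but ignores the size of the time gaps, so it covers only one of the two options mentioned in the list.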

This does not necessarily need to be a single API or new parameter for /api/runs, it might be several.

It would furthermore be useful to be able to get multiple series of runs together, with the aligned parameter being respected.

This is always the first problem I have to solve when doing any kind of time series analysis, so it would be great to solve it in one place 😄

@foolip
Member Author

foolip commented Sep 15, 2023

Just came across web-platform-tests/results-analysis#186, which is another difficulty with aligned series of runs which could possibly be handled in a wpt.fyi API instead. I think I'll add one more guarantee that I think would simplify the problem of hash-aligned but date-misaligned runs:

  • The WPT commit date is monotonically increasing (we don't have this information in wpt.fyi now)

This way, whether aligned is used or multiple series are fetched and aligned by the client, there are a few simple options that are less heuristic-y than web-platform-tests/results-analysis#186:

  • Just use the commit date
  • Use the earliest start time of the runs
  • Use the latest start time of the runs
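
The three options above could look roughly like this on the client side. This is a sketch under assumptions: runs are dictionaries with revision and time_start fields, timestamps share one sortable ISO 8601 format, and commit_dates is a hypothetical mapping from revision to commit date that the caller would have to supply, since wpt.fyi does not have that information today. The function name aligned_dates is also made up.

```python
from collections import defaultdict


def aligned_dates(series_by_product, strategy="earliest", commit_dates=None):
    """Map each revision present in every series to a single date string.

    series_by_product maps a product name to its list of runs.
    """
    by_revision = defaultdict(dict)
    for product, runs in series_by_product.items():
        for run in runs:
            # Keep the first run seen for each product and revision.
            by_revision[run["revision"]].setdefault(product, run)

    result = {}
    for revision, runs in by_revision.items():
        if len(runs) != len(series_by_product):
            continue  # not aligned: some product has no run at this revision
        starts = [run["time_start"] for run in runs.values()]
        if strategy == "commit":
            result[revision] = commit_dates[revision]
        elif strategy == "earliest":
            result[revision] = min(starts)
        elif strategy == "latest":
            result[revision] = max(starts)
        else:
            raise ValueError(f"unknown strategy: {strategy}")
    return result
```

With a monotonically increasing commit date, the "commit" strategy gives one unambiguous date per aligned revision, while the other two depend on when each product happened to run.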

cc @gsnedders

@gsnedders
Member

> • The OS version is monotonically increasing (same considerations as above)

This is not necessarily something we want to strictly guarantee; it should definitely broadly be true, but we have previously reverted OS upgrades.

Plus one could imagine running roughly the same configuration in different CI systems (as we previously had with the Bocoup-maintained Buildbot and Azure Pipelines for macOS), at different frequencies, which may alter selection.

@foolip
Member Author

foolip commented Nov 1, 2023

> > • The OS version is monotonically increasing (same considerations as above)
>
> This is not necessarily something we want to strictly guarantee; it should definitely broadly be true, but we have previously reverted OS upgrades.
>
> Plus one could imagine running roughly the same configuration in different CI systems (as we previously had with the Bocoup-maintained Buildbot and Azure Pipelines for macOS), at different frequencies, which may alter selection.

These are the cases I had in mind that an API should handle. The version bump should happen once without flip-flopping, by filtering out some runs. The logic for which runs to filter out is an interesting question without an obvious best answer. Perhaps multiple strategies are valid.
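
To make the "multiple strategies" point concrete, here is a hedged sketch of choosing a single breakpoint between two versions. It is one possible reading of the problem, not a proposal for the actual API. The function name pick_breakpoint, the strategy names, and the run fields time_start and browser_version are assumptions for illustration.

```python
from datetime import datetime


def parse(ts):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z'."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def pick_breakpoint(runs, old, new, strategy="most_runs"):
    """Keep `old` runs before one breakpoint and `new` runs after it.

    Every possible breakpoint is tried and the best-scoring one wins;
    on ties, the earliest breakpoint is kept.
    """
    runs = sorted(runs, key=lambda r: parse(r["time_start"]))
    best_score, best_kept = None, []
    for k in range(len(runs) + 1):
        before = [r for r in runs[:k] if r["browser_version"] == old]
        after = [r for r in runs[k:] if r["browser_version"] == new]
        if strategy == "most_runs":
            score = len(before) + len(after)
        elif strategy == "shortest_gap":
            if not before or not after:
                continue  # no gap to measure without runs on both sides
            gap = parse(after[0]["time_start"]) - parse(before[-1]["time_start"])
            score = -gap.total_seconds()
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        if best_score is None or score > best_score:
            best_score, best_kept = score, before + after
    return best_kept
```

The two strategies can disagree on the same input, which is one reason an API might need to expose the choice rather than hard-code it.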
