
Move computation of p-values out of get_pairwise_comparisons()? #750

Open
nikosbosse opened this issue Mar 26, 2024 · 5 comments
Labels
question Further information is requested

Comments

@nikosbosse
Contributor

Currently, three functions are involved in pairwise comparisons:

  • get_pairwise_comparisons() outputs a data.table with three different things:
    • pairwise mean score ratios between models
    • relative skill scores (the geometric mean of a model's mean score ratios; see the sketch after this list)
    • pairwise p-values
  • add_relative_skill() adds relative skill scores to an existing scores object (internally calling get_pairwise_comparisons())
  • plot_pairwise_comparisons() visualises either mean score ratios or the p-values.
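
To make the "geometric mean" part concrete, here is a tiny illustrative sketch (the numbers are made up and not taken from any real scores object):

# a model's relative skill is the geometric mean of its pairwise
# mean score ratios against all other models
ratios_vs_others <- c(0.80, 1.10, 0.95)  # hypothetical mean score ratios
relative_skill <- exp(mean(log(ratios_vs_others)))
relative_skill  # approximately 0.94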

Should the calculation of p-values and mean score ratios/relative skill scores be done by the same function?

Pro:

  • computing mean score ratios and computing p-values both rely on the same mechanic of comparing two models against each other

  • the function is currently called get_pairwise_comparisons(), so it makes sense to keep both kinds of pairwise output together

Contra:

  • it might be cleaner to separate the code: both get_pairwise_comparisons() and plot_pairwise_comparisons() currently contain code for both outputs.
  • the workflows feel a bit different: mean score ratios feel more like "visualisation of performance", while p-values feel more like "rigorous statistical testing".

The currently suggested workflows are as follows:

For getting relative skill scores, you call as_forecast(data) |> score() |> add_relative_skill().
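
As a minimal sketch of that call chain (using the example_quantile data that ships with scoringutils):

library(scoringutils)

scores_with_skill <- example_quantile |>
  as_forecast() |>
  score() |>
  add_relative_skill()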

For visualising mean score ratios, you call

pairwise <- example_quantile |>
  as_forecast() |>
  score() |>
  get_pairwise_comparisons() 

plot_pairwise_comparisons(pairwise)

For visualising p-values, you call

plot_pairwise_comparisons(pairwise, type = "pval")

We previously even had a nice plot that showed both p-values and mean score ratios in a single figure (using the upper and lower triangle), but that broke and we ditched it a while ago.


Options:

  1. leave everything as is for now.
  2. remove the computation of p-values for now, maybe rename get_pairwise_comparisons() to get_score_ratios(), and re-introduce the functionality later.
  3. do a rewrite with two separate workflows before the next CRAN release (version 2.0.0)
  4. other
@nikosbosse
Contributor Author

@sbfnk @seabbs
@nickreich @elray1 maybe you have thoughts or preferences as well?

@sbfnk
Contributor

sbfnk commented Mar 27, 2024

  1. Could get_pairwise_comparisons() have a metric (or the like) option which could be mean_score_ratio (default) or p_value? The plot function could then plot whichever is there.
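
A hypothetical sketch of that interface (the "mean_score_ratio" and "p_value" values for metric are the proposal above, not current behaviour; scores is assumed to be the output of score() as in the workflows above):

# hypothetical, not the current API
pairwise <- scores |>
  get_pairwise_comparisons(metric = "p_value")  # proposed; default "mean_score_ratio"

plot_pairwise_comparisons(pairwise)  # would plot whichever metric is present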

@nikosbosse
Contributor Author

> 1. Could get_pairwise_comparisons() have a metric (or the like) option which could be mean_score_ratio (default) or p_value? The plot function could then plot whichever is there.

It would get two metric arguments then :). My intuition is to prefer the status quo over that proposal for the following reasons:

  • we'd have to introduce and name an additional argument (or rather, the argument that decides what to do would move from plot_pairwise_comparisons() to get_pairwise_comparisons()). To me, the status quo feels a bit simpler.
  • we would still have the same code complexity issue with two functions covering two slightly different use cases.

@seabbs
Contributor

seabbs commented Mar 28, 2024

I think my preference is 1 (i.e., do nothing).

@nikosbosse
Contributor Author

OK, moving this to a later release then.

@nikosbosse nikosbosse added this to the scoringutils-2.x milestone Apr 3, 2024
@nikosbosse nikosbosse added the question Further information is requested label Apr 3, 2024