predict anticipated performance based on metadata variables #117

tcoffman-strive · 2023-05-30T15:40:28Z

#112 discusses stratifying metrics computation by metadata variables. Step 1 is just stratifying and reporting; Step 2 (mentioned in #112) starts to analyze relationships between metadata and expected results. This issue is about directly estimating predicted metric values (P, R, TP, FP, etc.) from the metadata. The idea is to use past results together with a datum's metadata to estimate the model's accuracy and/or confidence on that specific datum, as an improvement over blanket averages.
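To make the "improvement over blanket averages" concrete, here is a minimal sketch in plain Python. The data, the `time_of_day`-style metadata values, and the function names are all made up for illustration and are not part of this project's API.

```python
from collections import defaultdict

# Hypothetical past results: (metadata value, whether the prediction was correct).
past_results = [
    ("day", True), ("day", True), ("day", True), ("day", False),
    ("night", True), ("night", False), ("night", False), ("night", False),
]

def blanket_accuracy(results):
    """Single accuracy figure over all data, ignoring metadata."""
    return sum(correct for _, correct in results) / len(results)

def accuracy_by_metadata(results):
    """Accuracy computed separately for each metadata value."""
    counts = defaultdict(lambda: [0, 0])  # value -> [num_correct, num_total]
    for value, correct in results:
        counts[value][0] += int(correct)
        counts[value][1] += 1
    return {value: c / n for value, (c, n) in counts.items()}

overall = blanket_accuracy(past_results)
per_value = accuracy_by_metadata(past_results)
```

With this toy data the blanket average hides a large gap between the two metadata values, which is the kind of per-datum signal this issue wants to expose.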

You could start with "just" regression. If several metadata values are available, a dimensionality reduction technique (PCA or other) may be useful. The approach could use buckets/cohorts as envisioned in #112. Alternatively, it should be possible to formulate expectation maximization or some other optimization that directly estimates the parameters of the regression model predicting anticipated metric values, without building explicit cohorts.
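As one possible reading of the "just regression" starting point, the sketch below fits a logistic regression from numeric metadata to a per-datum correctness outcome, with no explicit cohorts. Logistic regression is my choice here, not something the issue prescribes; the two metadata features, the toy data, and the function names are hypothetical. It uses plain gradient descent so it has no dependencies.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(features, outcomes, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, outcomes):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss w.r.t. the linear term
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict_correct_probability(weights, bias, x):
    """Estimated probability that the model is correct on a datum with metadata x."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Hypothetical metadata per datum: [brightness, relative object size], both in [0, 1].
metadata = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.4], [0.4, 0.3],
            [0.6, 0.7], [0.7, 0.6], [0.8, 0.9], [0.9, 0.8]]
# 1 if the model's past prediction on that datum was correct, else 0.
was_correct = [0, 0, 1, 0, 1, 0, 1, 1]

weights, bias = fit_logistic(metadata, was_correct)
p_low = predict_correct_probability(weights, bias, [0.1, 0.1])
p_high = predict_correct_probability(weights, bias, [0.9, 0.9])
```

The same shape would apply to other targets (for example a per-datum FP indicator). With many metadata variables, a dimensionality reduction step such as PCA could be applied to `metadata` before fitting.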

ekorman added the backlog label Jun 13, 2023

ekorman added feature request and removed backlog labels Oct 10, 2023
