predict anticipated performance based on metadata variables #117

Open
tcoffman-strive opened this issue May 30, 2023 · 0 comments

#112 discusses stratifying metrics computation based on metadata variables. Step 1 there is just stratifying and reporting; Step 2 (also mentioned in #112) starts to analyze relationships between metadata and expected results. This issue goes one step further: directly estimating predicted metric values (P, R, TP, FP, etc.) from the metadata. The idea is to use past results together with the metadata of a datum to estimate the model's accuracy and/or confidence on that specific datum, as an improvement over blanket averages.

You could start with "just" regression. If several metadata values are available, a dimensionality reduction technique (PCA or other) may be useful. The approach could use buckets/cohorts as envisioned in #112. Alternatively, it should be possible to formulate expectation maximization or some other optimization that directly estimates the parameters of the regression model predicting anticipated metric values, without building explicit cohorts.
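As a rough illustration of the "just regression" starting point, here is a minimal sketch that reduces numeric metadata with PCA and then fits a logistic regression predicting whether the model was correct on each datum. Everything in it is an assumption made for illustration: the synthetic data, the function names (`pca_fit`, `fit_logistic`, etc.), and the choice of per-datum correctness as the target. It uses plain gradient descent rather than expectation maximization, and it is not tied to this project's API.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return (mean, components) for projecting X onto its top principal axes."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    """Project X onto the principal axes found by pca_fit."""
    return (X - mean) @ components.T

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fit_logistic(Z, y, lr=0.1, n_iter=2000):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    w = np.zeros(Z.shape[1])
    b = 0.0
    for _ in range(n_iter):
        residual = sigmoid(Z @ w + b) - y  # gradient of log-loss w.r.t. the logit
        w -= lr * (Z.T @ residual) / len(y)
        b -= lr * residual.mean()
    return w, b

# Hypothetical "past results": 4 correlated metadata columns driven by
# 2 latent factors, plus a 0/1 flag for whether the model was correct.
rng = np.random.default_rng(0)
n = 2000
latent = rng.normal(size=(n, 2))
mixing = np.array([[1.0, 0.5, -0.5, 0.2],
                   [0.0, 1.0, 0.5, -1.0]])
metadata = latent @ mixing + 0.05 * rng.normal(size=(n, 4))
true_logit = 1.5 * latent[:, 0] - 1.0 * latent[:, 1]
correct = (rng.random(n) < sigmoid(true_logit)).astype(float)

# Reduce the metadata, then regress correctness on the reduced features.
mean, components = pca_fit(metadata, n_components=2)
Z = pca_transform(metadata, mean, components)
w, b = fit_logistic(Z, correct)

per_datum = sigmoid(Z @ w + b)  # anticipated accuracy for each datum
blanket = correct.mean()        # single blanket average for comparison

# Brier score (mean squared error of the probability); lower is better.
brier_model = np.mean((per_datum - correct) ** 2)
brier_blanket = np.mean((blanket - correct) ** 2)
print(f"Brier, per-datum estimate: {brier_model:.3f}")
print(f"Brier, blanket average:    {brier_blanket:.3f}")
```

For brevity the sketch scores on the same data it was fit to; a real evaluation would hold out data. The same fitted model could also be compared against cohort averages from #112 to see whether explicit buckets are needed at all.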
