Adding PSI for continuous data #329

Open
wants to merge 1 commit into
base: main

Conversation

hakimelakhrass
Contributor

@hakimelakhrass hakimelakhrass commented Oct 14, 2023

Added Population Stability Index (PSI) for continuous data.

I used 0.25 as the alerting threshold, citing this: https://www.risk.net/journal-of-risk-model-validation/7725371/statistical-properties-of-the-population-stability-index#:~:text=In%20practice%2C%20the%20following%20%E2%80%9Crule,or%20type%20II%20error%20rates.

Used the Freedman-Diaconis rule to determine the bin size.

Also updated the comments in drift/methods to say "drift metric" instead of "performance metrics".

I recommend looking at the calculations and math closely to see whether it makes sense.

I'll add it for categorical next
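
For anyone reviewing the math, here is a minimal sketch of PSI over binned continuous data. It is not the code from this PR: the function name `psi_continuous`, the `eps` floor for empty bins, and the clipping of analysis values into the reference range are all assumptions made for illustration. It uses the usual definition, PSI = sum over bins of (a_i - r_i) * ln(a_i / r_i), where r_i and a_i are the reference and analysis bin proportions.

```python
import numpy as np


def psi_continuous(reference, analysis, bins="fd", eps=1e-6):
    """Hypothetical PSI helper for continuous data (illustration only)."""
    reference = np.asarray(reference, dtype=float)
    analysis = np.asarray(analysis, dtype=float)

    # Bin edges are derived from the reference data only, so both
    # samples are compared on the same grid.
    edges = np.histogram_bin_edges(reference, bins=bins)

    # Assumption: analysis values outside the reference range are
    # counted in the outermost bins instead of being dropped.
    analysis = np.clip(analysis, edges[0], edges[-1])

    ref_counts, _ = np.histogram(reference, bins=edges)
    ana_counts, _ = np.histogram(analysis, bins=edges)

    ref_pct = ref_counts / ref_counts.sum()
    ana_pct = ana_counts / ana_counts.sum()

    # Assumption: floor empty bins at eps to avoid log(0) and division by zero.
    ref_pct = np.clip(ref_pct, eps, None)
    ana_pct = np.clip(ana_pct, eps, None)

    return float(np.sum((ana_pct - ref_pct) * np.log(ana_pct / ref_pct)))


rng = np.random.default_rng(42)
ref = rng.normal(loc=0.0, scale=1.0, size=5_000)
shifted = rng.normal(loc=1.0, scale=1.0, size=5_000)

psi_same = psi_continuous(ref, ref)         # identical samples, no drift
psi_shifted = psi_continuous(ref, shifted)  # mean shifted by one standard deviation
alert = psi_shifted > 0.25                  # 0.25 is the threshold proposed in this PR
```

How empty bins and out-of-range values are handled changes the result noticeably, so that is probably the part of the calculation most worth checking.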

@hakimelakhrass hakimelakhrass added the enhancement New feature or request label Oct 14, 2023
@hakimelakhrass hakimelakhrass self-assigned this Oct 14, 2023
@codecov

codecov bot commented Oct 14, 2023

Codecov Report

Attention: Patch coverage is 20.45455%, with 35 lines in your changes missing coverage. Please review.

Project coverage is 84.69%. Comparing base (dd20ef7) to head (c58c7f5).
Report is 75 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| nannyml/drift/univariate/methods.py | 20.45% | 35 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #329      +/-   ##
==========================================
+ Coverage   83.40%   84.69%   +1.28%     
==========================================
  Files         100      100              
  Lines        7245     8931    +1686     
  Branches     1275     1730     +455     
==========================================
+ Hits         6043     7564    +1521     
- Misses        905     1016     +111     
- Partials      297      351      +54     


@nikml
Contributor

nikml commented Oct 16, 2023

Small Comment:

You may want to consider numpy.histogram_bin_edges instead of manually implementing the Freedman-Diaconis rule. It has an option to use FD specifically, but maybe use doane, as we do for Jensen-Shannon (JS). I remember we looked into which option was best back then.

We are also using it for JS here
https://github.com/NannyML/nannyml/blob/main/nannyml/drift/univariate/methods.py#L683
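
A rough sketch of what that suggestion could look like, comparing `numpy.histogram_bin_edges` with `bins="fd"` and `bins="doane"` against a hand-computed Freedman-Diaconis width. The sample data and variable names are made up for illustration and are not taken from the nannyml code.

```python
import numpy as np

rng = np.random.default_rng(0)
reference = rng.normal(loc=0.0, scale=1.0, size=1_000)

# Let NumPy pick the edges with the Freedman-Diaconis and Doane estimators.
fd_edges = np.histogram_bin_edges(reference, bins="fd")
doane_edges = np.histogram_bin_edges(reference, bins="doane")

# Manual Freedman-Diaconis bin width for comparison: 2 * IQR * n^(-1/3).
q75, q25 = np.percentile(reference, [75, 25])
fd_width = 2.0 * (q75 - q25) * reference.size ** (-1.0 / 3.0)

# NumPy rounds the number of bins up and spaces the edges evenly over
# [min, max], so its actual spacing is at most the FD width.
numpy_fd_spacing = fd_edges[1] - fd_edges[0]

print(f"fd:    {len(fd_edges) - 1} bins, spacing {numpy_fd_spacing:.4f}")
print(f"doane: {len(doane_edges) - 1} bins")
print(f"manual FD width: {fd_width:.4f}")
```

Either set of edges can be passed straight to `numpy.histogram` for both the reference and the analysis data, which keeps the binning consistent with the JS implementation linked above.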

@nnansters
Contributor

Maybe also add some numerical tests in tests/drift/test_univariate_drift_methods.py?

This helps us verify that the behavior is correct now and doesn't change over time!
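
A sketch of the kind of numerical test meant here, using values that can be checked by hand. The helper `psi_from_proportions` is a hypothetical stand-in defined only for this example; a real test would call the method added in this PR, whose name does not appear in this thread.

```python
import math

import numpy as np


def psi_from_proportions(ref_pct, ana_pct):
    """Hypothetical stand-in: PSI from two vectors of bin proportions."""
    ref_pct = np.asarray(ref_pct, dtype=float)
    ana_pct = np.asarray(ana_pct, dtype=float)
    return float(np.sum((ana_pct - ref_pct) * np.log(ana_pct / ref_pct)))


def test_psi_is_zero_for_identical_distributions():
    assert psi_from_proportions([0.2, 0.3, 0.5], [0.2, 0.3, 0.5]) == 0.0


def test_psi_matches_hand_computed_value():
    # (0.25 - 0.5) * ln(0.25 / 0.5) + (0.75 - 0.5) * ln(0.75 / 0.5)
    #   = 0.25 * ln(2) + 0.25 * ln(1.5) = 0.25 * ln(3)
    expected = 0.25 * math.log(3)
    assert math.isclose(psi_from_proportions([0.5, 0.5], [0.25, 0.75]), expected)


def test_psi_is_symmetric():
    a, b = [0.1, 0.4, 0.5], [0.3, 0.3, 0.4]
    assert math.isclose(psi_from_proportions(a, b), psi_from_proportions(b, a))
```

Pinning a hand-computed value like `0.25 * ln(3)` catches accidental changes to the formula, while the zero and symmetry checks cover properties that should hold for any PSI implementation.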


stale bot commented Dec 29, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Dec 29, 2023
@stale stale bot closed this Jan 5, 2024
@nikml nikml reopened this Jan 8, 2024
@stale stale bot removed the stale label Jan 8, 2024

stale bot commented Mar 8, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Mar 8, 2024
@stale stale bot closed this Mar 15, 2024
@nikml nikml reopened this Mar 15, 2024
@stale stale bot removed the stale label Mar 15, 2024

stale bot commented May 15, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label May 15, 2024
@nnansters nnansters removed the stale label May 15, 2024
Labels
enhancement New feature or request

3 participants