Skip to content

Nicerova7/statistical_iv

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Statistical IV

Our J-Divergence test is under the next null hypothesis

H0: The predictive power of the variable is not significant.

The null hypothesis is tested using a two-tailed distribution, and this should be taken into consideration when interpreting the p-value.

Explanation

Optimize your machine learning models with 'Statistical-IV'. Perform automated feature selection based on statistics and customize error control.

  1. Import package

    from statistical_iv import api
  2. Provide a DataFrame as Input:

    • Supply a DataFrame df containing your data for IV calculation.
  3. Specify Predictor Variables:

    • Prived a list of predictor variable names (variables_names) to analyze.
  4. Define the Target Variable:

    • Specify the name of the target variable (var_y) in your DataFrame.
  5. Indicate Variable Types:

    • Define the type of your predictor variables as 'categorical' or 'numerical' using the type_vars parameter.
  6. Optional: Set Maximum Bins:

    • Adjust the maximum number of bins for discretization (optional) using the max_bins parameter.
  7. Call the statistical_iv Function:

    • Calculate Statistical IV information by calling the statistical_iv function from api with the specified parameters (That is used for OptimalBinning package).
    result_df = api.statistical_iv(df, variables_names, var_y, type_vars, max_bins)

Example Result:

Output Example

Full Paper:

For a comprehensive exploration of the topic, we recommend perusing the contents of the article available at this link.

About

Statistical_IV: J-Divergence Hypothesis Test for the Information Value (IV)

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages