Grouped Permutation Importance

Understanding the fundamentals of a decision-making process is, for most purposes, an essential step in machine learning. In this context, the analysis of predefined groups of features can provide important indications for understanding and improving a prediction. This repository extends the univariate permutation importance to a grouped version that evaluates the influence of whole feature subsets on a machine learning model. It is implemented as a slight modification of scikit-learn's permutation importance.
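
The underlying idea can be illustrated in a few lines (a minimal sketch of the principle, not the package's actual implementation): instead of permuting one column at a time, all columns of a group are shuffled with the same row permutation, and the resulting drop in score is attributed to the group as a whole.

import numpy as np

def group_importance_sketch(model, X, y, group_idx, scorer, n_repeats=10, seed=0):
    # Illustrative only: `model` is already fitted and `scorer` is a callable
    # such as sklearn.metrics.get_scorer("balanced_accuracy").
    rng = np.random.default_rng(seed)
    baseline = scorer(model, X, y)
    drops = []
    for _ in range(n_repeats):
        perm = rng.permutation(len(X))
        X_perm = X.copy()
        # Shuffle all columns of the group jointly with one shared permutation.
        X_perm[:, group_idx] = X[perm][:, group_idx]
        drops.append(baseline - scorer(model, X_perm, y))
    return np.mean(drops)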

Install via pip

pip install git+https://github.com/lucasplagwitz/grouped_permutation_importance
Usage

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

from grouped_permutation_importance import grouped_permutation_importance

data = load_breast_cancer()
feature_names = data["feature_names"].tolist()
X, y = data["data"], data["target"]

# Group the features by name: the "mean", "error", and "worst" statistics.
idxs = []
columns = ["mean", "error", "worst"]
for key in columns:
    idxs.append([i for i, name in enumerate(feature_names) if key in name])

cv = RepeatedStratifiedKFold()
pipe = Pipeline([("MinMax", MinMaxScaler()), ("SVC", SVC())])

r = grouped_permutation_importance(pipe, X, y, idxs=idxs, n_repeats=50,
                                   random_state=0, scoring="balanced_accuracy",
                                   n_jobs=5, cv=cv, perm_set="test")
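
If the returned object mirrors scikit-learn's permutation_importance result (a plausible assumption given the stated "slight modification", but not confirmed here), the per-group results can be inspected along these lines:

# Assumption: `r` behaves like sklearn's permutation_importance Bunch,
# with one entry per group in `idxs`.
for name, mean, std in zip(columns, r["importances_mean"], r["importances_std"]):
    print(f"{name}: {mean:.3f} +/- {std:.3f}")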

Simulation

The file "examples/make_class.py" contains a small simulation that verifies correctness. Based on scikit-learn's make_classification method, feature subsets with differing informativeness are analyzed.
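
A minimal sketch of such a check (the group sizes, estimator, and parameters below are illustrative assumptions, not the exact contents of "examples/make_class.py"): an informative group should receive a clearly higher importance than a pure noise group.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold

from grouped_permutation_importance import grouped_permutation_importance

# With shuffle=False, make_classification places the informative columns
# first, so columns 0-9 are informative and columns 10-19 are noise.
X, y = make_classification(n_samples=500, n_features=20, n_informative=10,
                           n_redundant=0, shuffle=False, random_state=0)
idxs = [list(range(10)), list(range(10, 20))]  # informative vs. noise group

r = grouped_permutation_importance(LogisticRegression(max_iter=1000), X, y,
                                   idxs=idxs, n_repeats=20, random_state=0,
                                   scoring="accuracy", n_jobs=1,
                                   cv=RepeatedStratifiedKFold(n_repeats=2),
                                   perm_set="test")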

Model interpretation

The file "examples/brain_atlas.py" demonstrates a neuroimaging use case in which brain regions are rated with respect to a target variable (age, CDR, or biological sex).

Citing

If you use the Grouped Permutation Importance in a scientific publication, we would appreciate citations to the following paper:

Lucas Plagwitz, Alexander Brenner, Michael Fujarski, and Julian Varghese. Supporting AI-Explainability by Analyzing Feature Subsets in a Machine Learning Model.
Studies in Health Technology and Informatics, Volume 294: Challenges of Trustable AI and Added-Value on Health. doi:10.3233/SHTI220406