Interpretable ML @ Avanade ITS

EmTech V-Team - Explainable AI

Nema Sobhani
IT Analytics, Avanade

Objective

Explainability/interpretability overview and demonstration of ML tools in the Azure stack used at Avanade.

Agenda

I. Background

Why do we need Interpretable ML?
SHAP!
MSFT's Interpretability Offerings

II. In Action

Dummy Example
Avanade Interpretability VM

III. Other Approaches

Different Methods
Competing offerings

IV. Wrap-up

Questions
Sources/Suggested Reading
Packages

I. Background

Due to our unique relationship with Microsoft, we have been given direct access to the product owners of Microsoft's cutting-edge machine learning, interpretability, and explainability tools, including interpret-ml, interpret-community, and the azureml SDK (May Hu, Mehrnoosh Sameki, Ilya Matiach).

Why do we need Interpretable ML?

"The goal of science is to gain knowledge, but many problems are solved with big datasets and black box machine learning models. The model itself becomes the source of knowledge instead of the data. Interpretability makes it possible to extract this additional knowledge captured by the model."

- Christoph Molnar, ‘Interpretable Machine Learning’

Applications

Financial Services/Banking (fraud detection)
Marketing (user engagement)
Healthcare (individualized medicine and tracking)
Epidemiology (disease outbreak modeling)

Benefits

Validation of domain knowledge
Provides actionable evidence
Guides data practices and feedback

SHAP!

SHapley Additive exPlanations

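SHAP assigns each feature a contribution to an individual prediction, derived from Shapley values (Lloyd Shapley's work in cooperative game theory). The contributions are locally accurate (they sum to the difference between the prediction and the baseline expectation) and consistent, and aggregating them across a dataset gives global feature importances.

Ideal for use with opaque models (boosted trees, kernel-based models, neural networks, etc.).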
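To make the game-theory idea concrete, here is a minimal sketch that computes exact Shapley values by enumerating every coalition. The `toy_value` payoff function and the feature names are hypothetical stand-ins for a model; the shap package uses much faster approximations and model-specific algorithms rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a coalition value function value(frozenset) -> float."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of p when it joins coalition s
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def toy_value(coalition):
    """Hypothetical payoff: two main effects plus one interaction."""
    payoff = 0.0
    if "area" in coalition:
        payoff += 10.0
    if "texture" in coalition:
        payoff += 4.0
    if "area" in coalition and "texture" in coalition:
        payoff += 6.0  # interaction term, split evenly between the two features
    return payoff


features = ["area", "texture", "radius"]
phi = shapley_values(features, toy_value)
for name, contribution in phi.items():
    print(f"{name}: {contribution:.2f}")
```

The contributions sum to the payoff of the full coalition minus the payoff of the empty one, which is the local accuracy property, and the unused feature receives zero.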

MSFT's Interpretability Offerings

Microsoft addressed the need for a unified API that returns model explanations and feature importances across model types, built into its machine learning platform.

From the SDK
- pip install --upgrade azureml-sdk[explain,interpret,notebooks]
Only interpretability package
- pip install interpret-community

Using the TabularExplainer object, the model type is detected and the appropriate SHAP explainer is selected to generate feature importances.

Original Model	Invoked Explainer
Tree-based models	SHAP TreeExplainer
Deep Neural Network models	SHAP DeepExplainer
Linear models	SHAP LinearExplainer
None of the above	SHAP KernelExplainer
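The selection in the table can be pictured as an ordered list of checks with a model-agnostic fallback. The sketch below is purely illustrative and is not the interpret-community implementation: `pick_explainer`, the attribute checks, and the `Fake*` classes are all hypothetical.

```python
# Illustrative only: NOT how interpret-community detects model types.

def pick_explainer(model, candidates):
    """Return the name of the first explainer whose check accepts the model."""
    for name, supports in candidates:
        if supports(model):
            return name
    raise ValueError("no explainer accepted the model")


# Hypothetical capability checks, ordered from most specific to most general.
CANDIDATES = [
    ("TreeExplainer", lambda m: hasattr(m, "feature_importances_")),
    ("DeepExplainer", lambda m: hasattr(m, "layers")),
    ("LinearExplainer", lambda m: hasattr(m, "coef_")),
    ("KernelExplainer", lambda m: True),  # model-agnostic fallback
]


class FakeTree:
    feature_importances_ = [0.7, 0.3]


class FakeLinear:
    coef_ = [1.0, -2.0]


class FakeBlackBox:
    def predict(self, X):
        return [0 for _ in X]


chosen = [pick_explainer(m, CANDIDATES) for m in (FakeTree(), FakeLinear(), FakeBlackBox())]
print(chosen)
```

The design point is the ordering: specialised explainers are tried first because they are faster and exact for their model family, and the kernel-based explainer is kept as the last resort.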

This package also supports a Mimic Explainer (Global Surrogate) and a Permutation Feature Importance Explainer (PFI), both of which are model-agnostic and will be covered later.

II. In Action

Dummy Example

Scikit-Learn Breast Cancer Binary Classification

Pip installations:

pip install numpy pandas scikit-learn lightgbm interpret-community[visualization]

import numpy as np
import pandas as pd
import sklearn.datasets as datasets
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report
import lightgbm as lgbm

# Load and partition data
data = datasets.load_breast_cancer()

X = data.data
y = data.target # 0 = malignant, 1 = benign
feature_names = data.feature_names.tolist()
classes = data.target_names.tolist()

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
X_train, X_valid, y_train, y_valid = train_test_split(X_train, y_train, test_size=0.25, random_state=42, stratify=y_train)

# Model training
clf = lgbm.LGBMClassifier()

clf.fit(
    X=X_train,
    y=y_train,
    eval_set=[(X_valid, y_valid)],
    eval_metric='auc',
    feature_name=feature_names,
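    verbose=25  # log every 25 rounds; LightGBM >= 4.0 removed this argument, use callbacks=[lgbm.log_evaluation(25)] instead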
)

y_pred = clf.predict(X_test)

print('\nConfusion Matrix: \n', confusion_matrix(y_test, y_pred))
print('\nClassification Report: \n', classification_report(y_test, y_pred))

[25]	valid_0's auc: 0.989519	valid_0's binary_logloss: 0.145683
[50]	valid_0's auc: 0.987226	valid_0's binary_logloss: 0.136426
[75]	valid_0's auc: 0.986243	valid_0's binary_logloss: 0.172291
[100]	valid_0's auc: 0.989846	valid_0's binary_logloss: 0.189166

Confusion Matrix: 
 [[40  2]
 [ 3 69]]

Classification Report: 
               precision    recall  f1-score   support

           0       0.93      0.95      0.94        42
           1       0.97      0.96      0.97        72

    accuracy                           0.96       114
   macro avg       0.95      0.96      0.95       114
weighted avg       0.96      0.96      0.96       114
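As a sanity check, the figures in the classification report can be recomputed by hand from the confusion matrix printed above. This sketch assumes scikit-learn's convention that rows are true classes and columns are predicted classes; `per_class_metrics` is a helper introduced here for illustration.

```python
# Confusion matrix copied from the output above (rows: true class, columns: predicted class)
cm = [[40, 2],
      [3, 69]]


def per_class_metrics(cm, k):
    """Precision, recall and F1 for class k of a square confusion matrix."""
    tp = cm[k][k]
    predicted_k = sum(row[k] for row in cm)  # column sum: everything predicted as k
    actual_k = sum(cm[k])                    # row sum: everything that truly is k
    precision = tp / predicted_k
    recall = tp / actual_k
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


accuracy = sum(cm[i][i] for i in range(len(cm))) / sum(map(sum, cm))

for k in range(len(cm)):
    p, r, f = per_class_metrics(cm, k)
    print(f"class {k}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
print(f"accuracy={accuracy:.2f}")
```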

# Feature Importance (SHAP)
from interpret_community import TabularExplainer

explainer = TabularExplainer(clf, initialization_examples=X_train, features=feature_names, classes=classes)

# Global Feature Importances
global_explanation = explainer.explain_global(X_train)
display(pd.DataFrame.from_dict(global_explanation.get_feature_importance_dict(), orient='index', columns=['SHAP Value']).head(10))

	SHAP Value
worst area	2.341634
worst concave points	1.823587
worst perimeter	1.361179
worst texture	1.167765
area error	0.810253
mean concave points	0.637934
worst smoothness	0.334095
mean texture	0.311042
worst radius	0.200612
mean smoothness	0.200187

# Local Feature Importances (for predicting benign class)
shap = global_explanation.local_importance_values[1]

df_shap = pd.DataFrame(shap, columns=feature_names)
display(df_shap.head())

	mean radius	mean texture	mean perimeter	mean area	mean smoothness	mean compactness	mean concavity	mean concave points	mean symmetry	mean fractal dimension	...	worst radius	worst texture	worst perimeter	worst area	worst smoothness	worst compactness	worst concavity	worst concave points	worst symmetry	worst fractal dimension
0	-0.021539	0.321744	-0.021727	-0.001728	-0.782009	0.013229	-0.071580	-0.124397	-0.043490	-0.034941	...	0.188589	1.557899	1.157621	2.275592	-0.408454	-0.001883	0.029665	1.730362	0.047198	-0.109017
1	-0.126181	-0.371998	-0.073260	-0.001627	-0.214240	-0.027172	-0.291059	-0.304990	-0.199762	-0.004075	...	-0.041869	-2.656456	-2.995181	-0.203860	-0.740862	0.005230	-0.307423	-0.285139	-0.441779	-0.004399
2	0.003145	0.339770	-0.035467	-0.001743	-0.745712	0.039464	0.062067	0.488236	-0.047262	-0.037220	...	0.035996	1.199373	1.354136	1.989123	-0.379708	0.062644	0.059645	1.566148	-0.050104	0.031548
3	0.067903	0.657175	0.012231	0.003307	0.123375	-0.365281	0.033967	0.533232	0.012120	0.062579	...	0.079415	2.114178	0.219407	1.221378	0.153687	0.045893	0.045342	1.441597	0.435639	-0.054577
4	-0.055729	-0.170627	-0.046742	-0.002422	-0.724170	0.009585	-0.229870	-0.462293	-0.172071	-0.042171	...	-0.077372	-2.193885	1.002135	2.284974	-0.459759	-0.027261	-0.353700	-5.373368	-0.290265	-0.026400

5 rows × 30 columns

# Visualization
from interpret_community.widget import ExplanationDashboard

display(ExplanationDashboard(global_explanation, clf, datasetX=X_train, trueY=y_train))

Avanade Interpretability VM (Demo)

III. Other Approaches

Different Methods

Local Interpretable Model-agnostic Explanations (LIME)

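LIME explains a single prediction: it perturbs the instance, queries the opaque model on the perturbed samples, and fits an interpretable surrogate (typically a sparse linear model) weighted by proximity to the original instance. The resulting explanation is local and interpretable, with no guarantee of being globally relevant.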
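The local-surrogate idea can be sketched in a few lines of NumPy. This is a simplified illustration, not the lime package: `lime_explain`, the Gaussian perturbation scheme, the kernel width, and the `black_box` function are all assumptions chosen for the example.

```python
import numpy as np


def lime_explain(predict_fn, x, n_samples=2000, scale=0.5, kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear surrogate around the instance x."""
    rng = np.random.default_rng(seed)
    # 1. Perturb the instance with Gaussian noise
    Z = x + rng.normal(0.0, scale, size=(n_samples, x.size))
    # 2. Query the opaque model on the perturbed samples
    y = predict_fn(Z)
    # 3. Weight each sample by its proximity to x (exponential kernel)
    dist = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 4. Weighted least squares: scale rows by sqrt(weight), then solve
    A = np.hstack([np.ones((n_samples, 1)), Z - x])
    sw = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(A * sw, y * sw[:, 0], rcond=None)
    return coef[0], coef[1:]  # intercept, per-feature local weights


def black_box(X):
    """Hypothetical opaque model: nonlinear in both features."""
    return np.sin(X[:, 0]) + X[:, 1] ** 2


x0 = np.array([0.0, 1.0])
intercept, local_weights = lime_explain(black_box, x0)
print(local_weights)
```

The local weights approximate the slope of the model around `x0` (close to 1 and 2 for this toy function) and would differ at another instance, which is why the explanation is not globally relevant.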

Global Surrogate Models (Mimic Explainers)

The same idea as LIME, applied at global scale. An interpretable model (tree or linear) is trained on the original data, using the opaque model's predicted labels as the target instead of the true labels.
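A minimal sketch of a global surrogate, assuming a linear surrogate fitted by ordinary least squares. `fit_linear_surrogate` and `opaque_model` are hypothetical names for this example and are not part of the Mimic Explainer API; the fidelity score is the R² of the surrogate against the opaque model's own predictions.

```python
import numpy as np


def fit_linear_surrogate(predict_fn, X):
    """Fit a linear model to the opaque model's predictions and report its fidelity."""
    y_hat = predict_fn(X)  # targets come from the opaque model, not from ground truth
    A = np.hstack([np.ones((X.shape[0], 1)), X])
    coef, *_ = np.linalg.lstsq(A, y_hat, rcond=None)
    fitted = A @ coef
    ss_res = np.sum((y_hat - fitted) ** 2)
    ss_tot = np.sum((y_hat - y_hat.mean()) ** 2)
    fidelity = 1.0 - ss_res / ss_tot  # R^2 of surrogate vs. opaque model
    return coef, fidelity


def opaque_model(X):
    """Hypothetical opaque model: mostly linear with a small nonlinear term."""
    return 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * np.sin(5.0 * X[:, 2])


rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(500, 3))
coef, fidelity = fit_linear_surrogate(opaque_model, X)
print("intercept and weights:", np.round(coef, 2))
print("fidelity (R^2):", round(float(fidelity), 3))
```

A surrogate's coefficients are only worth reading when fidelity is high; a low score means the interpretable model does not mimic the opaque one well enough to explain it.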

Permutation Feature Importance (PFI)

Shuffles the values of one feature at a time and measures the effect on a performance metric. The larger the drop in performance, the more important the feature.
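The shuffling procedure can be sketched with the standard library alone. `permutation_importance`, `toy_model`, and the synthetic data below are hypothetical and written for this example; they are not the interpret-community PFI explainer.

```python
import random


def permutation_importance(predict_fn, X, y, metric, n_repeats=5, seed=0):
    """Mean drop in the metric when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = metric(y, predict_fn(X))
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - metric(y, predict_fn(X_perm)))
        importances.append(sum(drops) / n_repeats)
    return importances


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def toy_model(X):
    """Hypothetical opaque model that only looks at feature 0."""
    return [1 if row[0] > 0.5 else 0 for row in X]


data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = toy_model(X)  # labels the model predicts perfectly, so baseline accuracy is 1.0
pfi = permutation_importance(toy_model, X, y, accuracy)
print(pfi)
```

Shuffling feature 0 destroys the model's accuracy, while shuffling feature 1 changes nothing because the model never reads it.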

Diverse Counterfactual Explanations (DiCE)

Uses feature perturbations to show what would need to change for a prediction to move to a different class, giving actionable outcomes.

e.g. If the credit score were > 700, user X would likely move into the "Loan Approved" classification.
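The loan example can be sketched as a search for the smallest single-feature change that flips the decision. This is a deliberately simple stand-in and not DiCE's algorithm, which optimises for several diverse and feasible counterfactuals at once; the scoring rule, threshold, and feature grids are hypothetical.

```python
def approve_loan(applicant):
    """Hypothetical opaque decision rule, in integer points to avoid float rounding."""
    score = (applicant["credit_score"]
             + applicant["income"] // 1000
             - 50 * applicant["open_defaults"])
    return score >= 760


def single_feature_counterfactuals(model, instance, candidate_values):
    """For each feature, find the closest candidate value that flips the prediction."""
    original = model(instance)
    found = {}
    for feature, values in candidate_values.items():
        # Try candidates from nearest to farthest from the current value
        for v in sorted(values, key=lambda v: abs(v - instance[feature])):
            changed = dict(instance, **{feature: v})
            if model(changed) != original:
                found[feature] = v
                break
    return found


applicant = {"credit_score": 650, "income": 60000, "open_defaults": 0}
grids = {
    "credit_score": range(300, 851, 10),
    "income": range(20000, 200001, 10000),
    "open_defaults": range(0, 4),
}
counterfactuals = single_feature_counterfactuals(approve_loan, applicant, grids)
print(approve_loan(applicant), counterfactuals)
```

Features for which no candidate value flips the outcome (here, open defaults) are simply left out of the result.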

Competing offerings

AWS SageMaker Debugger only recently started using the shap package, which Microsoft has already integrated into Azure ML; the contrast highlights the maturity of Microsoft's early investment in explainable AI.

Oracle's "Skater" is a Python package that supports local interpretation using LIME and global interpretation using scalable Bayesian rule lists and tree surrogates. The documentation is rather sparse and there doesn't seem to be any momentum to expand to other methods.

Scikit-learn's built-in feature importances provide some value for simple models, but lack the depth and versatility of Microsoft's interpret-community.

Explain Like I'm 5 (ELI5) uses LIME and PFI on opaque models and offers specialized support for text classifiers. It does not offer any visual utilities or SHAP support.

The web app ml-interpret is an online-only platform where a dataset can be uploaded and a model selected to explain the outcomes of opaque models, but there is practically no customizability and the upload size is restricted.

IV. Wrap-up

Questions

Sources/Suggested Reading

Molnar, C. "Interpretable Machine Learning: A Guide for Making Black Box Models Explainable", 2019
Lundberg, S. M. and Lee, S.-I. "A Unified Approach to Interpreting Model Predictions", NIPS 2017
Shapley, L. S. "Notes on the n-Person Game -- II: The Value of an n-Person Game", 1951

Packages

https://github.com/slundberg/shap
https://github.com/interpretml/interpret-community
https://github.com/interpretml/DiCE
https://github.com/TeamHG-Memex/eli5
https://github.com/oracle/Skater
http://ml-interpret.herokuapp.com/
