shap

A game theoretic approach to explain the output of any machine learning model.

https://github.com/shap/shap

Science Score: 46.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org, nature.com
  • Committers with academic emails
    28 of 270 committers (10.4%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.5%) to scientific vocabulary

Keywords

deep-learning explainability gradient-boosting interpretability machine-learning shap shapley

Keywords from Contributors

distribution parallel gbrt gbm gbdt closember alignment flexible data-mining autograding
Last synced: 6 months ago

Repository

A game theoretic approach to explain the output of any machine learning model.

Basic Info
  • Host: GitHub
  • Owner: shap
  • License: mit
  • Language: Jupyter Notebook
  • Default Branch: master
  • Homepage: https://shap.readthedocs.io
  • Size: 280 MB
Statistics
  • Stars: 24,341
  • Watchers: 248
  • Forks: 3,421
  • Open Issues: 690
  • Releases: 59
Topics
deep-learning explainability gradient-boosting interpretability machine-learning shap shapley
Created about 9 years ago · Last pushed 6 months ago
Metadata Files
Readme Contributing License

README.md



SHAP (SHapley Additive exPlanations) is a game theoretic approach to explain the output of any machine learning model. It connects optimal credit allocation with local explanations using the classic Shapley values from game theory and their related extensions (see papers for details and citations).
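To make the game-theoretic connection concrete, here is a minimal, illustrative sketch (not any of the package's optimized algorithms) that computes exact Shapley values for a hypothetical toy value function by enumerating every coalition. The value function `v` and the player names are invented for illustration:

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values by enumerating all coalitions (exponential cost)."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            for coalition in combinations(others, k):
                # weight of a coalition of size k in the Shapley formula
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                # marginal contribution of p to this coalition
                total += weight * (value(set(coalition) | {p}) - value(set(coalition)))
        phi[p] = total
    return phi

# toy "model": additive effects for a and b, plus an a-c interaction
def v(S):
    return 2.0 * ("a" in S) + 1.0 * ("b" in S) + 1.0 * ("a" in S and "c" in S)

phi = shapley_values(["a", "b", "c"], v)
# efficiency property: the attributions sum to v(all players) - v(empty set) = 4.0,
# and the a-c interaction credit of 1.0 is split equally between a and c
```

The exponential enumeration is exactly what SHAP's specialized explainers avoid, either exactly (trees) or approximately (deep models, arbitrary functions).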

Install

SHAP can be installed from either PyPI or conda-forge:

```sh
pip install shap
```

or

```sh
conda install -c conda-forge shap
```

Tree ensemble example (XGBoost/LightGBM/CatBoost/scikit-learn/pyspark models)

While SHAP can explain the output of any machine learning model, we have developed a high-speed exact algorithm for tree ensemble methods (see our Nature MI paper). Fast C++ implementations are supported for XGBoost, LightGBM, CatBoost, scikit-learn and pyspark tree models:

```python
import xgboost
import shap

# train an XGBoost model
X, y = shap.datasets.california()
model = xgboost.XGBRegressor().fit(X, y)

# explain the model's predictions using SHAP
# (same syntax works for LightGBM, CatBoost, scikit-learn, transformers, Spark, etc.)
explainer = shap.Explainer(model)
shap_values = explainer(X)

# visualize the first prediction's explanation
shap.plots.waterfall(shap_values[0])
```

The above explanation shows features each contributing to push the model output from the base value (the average model output over the training dataset we passed) to the model output. Features pushing the prediction higher are shown in red, those pushing the prediction lower are in blue. Another way to visualize the same explanation is to use a force plot (these are introduced in our Nature BME paper):

```python
# visualize the first prediction's explanation with a force plot
shap.plots.force(shap_values[0])
```

If we take many force plot explanations such as the one shown above, rotate them 90 degrees, and then stack them horizontally, we can see explanations for an entire dataset (in the notebook this plot is interactive):

```python
# visualize all the training set predictions
shap.plots.force(shap_values[:500])
```

To understand how a single feature affects the output of the model, we can plot the SHAP value of that feature vs. the value of the feature for all the examples in a dataset. Since SHAP values represent a feature's responsibility for a change in the model output, the plot below represents the change in predicted house price as the latitude changes. Vertical dispersion at a single value of latitude represents interaction effects with other features. To help reveal these interactions we can color by another feature. If we pass the whole explanation tensor to the color argument, the scatter plot will pick the best feature to color by. In this case it picks longitude.

```python
# create a dependence scatter plot to show the effect of a single feature across the whole dataset
shap.plots.scatter(shap_values[:, "Latitude"], color=shap_values)
```

To get an overview of which features are most important for a model, we can plot the SHAP values of every feature for every sample. The plot below sorts features by the sum of SHAP value magnitudes over all samples, and uses SHAP values to show the distribution of the impacts each feature has on the model output. The color represents the feature value (red high, blue low). This reveals, for example, that a higher median income increases the predicted home price.

```python
# summarize the effects of all the features
shap.plots.beeswarm(shap_values)
```

We can also just take the mean absolute value of the SHAP values for each feature to get a standard bar plot (produces stacked bars for multi-class outputs):

```python
shap.plots.bar(shap_values)
```

Natural language example (transformers)

SHAP has specific support for natural language models like those in the Hugging Face transformers library. By adding coalitional rules to traditional Shapley values we can form games that explain large modern NLP models using very few function evaluations. Using this functionality is as simple as passing a supported transformers pipeline to SHAP:

```python
import transformers
import shap

# load a transformers pipeline model
model = transformers.pipeline('sentiment-analysis', return_all_scores=True)

# explain the model on a sample input
explainer = shap.Explainer(model)
shap_values = explainer(["What a great movie! ...if you have no taste."])

# visualize the first prediction's explanation for the POSITIVE output class
shap.plots.text(shap_values[0, :, "POSITIVE"])
```

Deep learning example with DeepExplainer (TensorFlow/Keras models)

Deep SHAP is a high-speed approximation algorithm for SHAP values in deep learning models that builds on a connection with DeepLIFT described in the SHAP NIPS paper. The implementation here differs from the original DeepLIFT by using a distribution of background samples instead of a single reference value, and using Shapley equations to linearize components such as max, softmax, products, divisions, etc. Note that some of these enhancements have since also been integrated into DeepLIFT. TensorFlow models and Keras models using the TensorFlow backend are supported (there is also preliminary support for PyTorch):

```python
# ...include code from https://github.com/keras-team/keras/blob/master/examples/demo_mnist_convnet.py

import shap
import numpy as np

# select a set of background examples to take an expectation over
background = x_train[np.random.choice(x_train.shape[0], 100, replace=False)]

# explain predictions of the model on four images
e = shap.DeepExplainer(model, background)
# ...or pass tensors directly
# e = shap.DeepExplainer((model.layers[0].input, model.layers[-1].output), background)
shap_values = e.shap_values(x_test[1:5])

# plot the feature attributions
shap.image_plot(shap_values, -x_test[1:5])
```

The plot above explains ten outputs (digits 0-9) for four different images. Red pixels increase the model's output while blue pixels decrease the output. The input images are shown on the left, and as nearly transparent grayscale backings behind each of the explanations. The sum of the SHAP values equals the difference between the expected model output (averaged over the background dataset) and the current model output. Note that for the 'zero' image the blank middle is important, while for the 'four' image the lack of a connection on top makes it a four instead of a nine.

Deep learning example with GradientExplainer (TensorFlow/Keras/PyTorch models)

Expected gradients combines ideas from Integrated Gradients, SHAP, and SmoothGrad into a single expected value equation. This allows an entire dataset to be used as the background distribution (as opposed to a single reference value) and allows local smoothing. If we approximate the model with a linear function between each background data sample and the current input to be explained, and we assume the input features are independent, then expected gradients will compute approximate SHAP values. In the example below we have explained how the 7th intermediate layer of the VGG16 ImageNet model impacts the output probabilities.

```python
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input
import keras.backend as K
import numpy as np
import json
import shap

# load pre-trained model and choose two images to explain
model = VGG16(weights='imagenet', include_top=True)
X, y = shap.datasets.imagenet50()
to_explain = X[[39, 41]]

# load the ImageNet class names
url = "https://s3.amazonaws.com/deep-learning-models/image-models/imagenet_class_index.json"
fname = shap.datasets.cache(url)
with open(fname) as f:
    class_names = json.load(f)

# explain how the input to the 7th layer of the model explains the top two classes
def map2layer(x, layer):
    feed_dict = dict(zip([model.layers[0].input], [preprocess_input(x.copy())]))
    return K.get_session().run(model.layers[layer].input, feed_dict)

e = shap.GradientExplainer(
    (model.layers[7].input, model.layers[-1].output),
    map2layer(X, 7),
    local_smoothing=0  # std dev of smoothing noise
)
shap_values, indexes = e.shap_values(map2layer(to_explain, 7), ranked_outputs=2)

# get the names for the classes
index_names = np.vectorize(lambda x: class_names[str(x)][1])(indexes)

# plot the explanations
shap.image_plot(shap_values, to_explain, index_names)
```

Predictions for two input images are explained in the plot above. Red pixels represent positive SHAP values that increase the probability of the class, while blue pixels represent negative SHAP values that reduce the probability of the class. By using ranked_outputs=2 we explain only the two most likely classes for each input (this spares us from explaining all 1,000 classes).

Model agnostic example with KernelExplainer (explains any function)

Kernel SHAP uses a specially-weighted local linear regression to estimate SHAP values for any model. Below is a simple example for explaining a multi-class SVM on the classic iris dataset.

```python
import sklearn
import shap
from sklearn.model_selection import train_test_split

# print the JS visualization code to the notebook
shap.initjs()

# train a SVM classifier
X_train, X_test, Y_train, Y_test = train_test_split(*shap.datasets.iris(), test_size=0.2, random_state=0)
svm = sklearn.svm.SVC(kernel='rbf', probability=True)
svm.fit(X_train, Y_train)

# use Kernel SHAP to explain test set predictions
explainer = shap.KernelExplainer(svm.predict_proba, X_train, link="logit")
shap_values = explainer.shap_values(X_test, nsamples=100)

# plot the SHAP values for the Setosa output of the first instance
shap.force_plot(explainer.expected_value[0], shap_values[0][0, :], X_test.iloc[0, :], link="logit")
```

The above explanation shows four features each contributing to push the model output from the base value (the average model output over the training dataset we passed) towards zero. If there were any features pushing the class label higher they would be shown in red.

If we take many explanations such as the one shown above, rotate them 90 degrees, and then stack them horizontally, we can see explanations for an entire dataset. This is exactly what we do below for all the examples in the iris test set:

```python
# plot the SHAP values for the Setosa output of all instances
shap.force_plot(explainer.expected_value[0], shap_values[0], X_test, link="logit")
```

SHAP Interaction Values

SHAP interaction values are a generalization of SHAP values to higher-order interactions. Fast exact computation of pairwise interactions is implemented for tree models with shap.TreeExplainer(model).shap_interaction_values(X). This returns a matrix for every prediction, where the main effects are on the diagonal and the interaction effects are off-diagonal. These values often reveal interesting hidden relationships, such as how the increased risk of death peaks for men at age 60 (see the NHANES notebook for details).

Sample notebooks

The notebooks below demonstrate different use cases for SHAP. Look inside the notebooks directory of the repository if you want to try playing with the original notebooks yourself.

TreeExplainer

An implementation of Tree SHAP, a fast and exact algorithm to compute SHAP values for trees and ensembles of trees.

DeepExplainer

An implementation of Deep SHAP, a faster (but only approximate) algorithm to compute SHAP values for deep learning models that is based on connections between SHAP and the DeepLIFT algorithm.

GradientExplainer

An implementation of expected gradients to approximate SHAP values for deep learning models. It is based on connections between SHAP and the Integrated Gradients algorithm. GradientExplainer is slower than DeepExplainer and makes different approximation assumptions.

LinearExplainer

For a linear model with independent features we can analytically compute the exact SHAP values. We can also account for feature correlation if we are willing to estimate the feature covariance matrix. LinearExplainer supports both of these options.

KernelExplainer

An implementation of Kernel SHAP, a model-agnostic method to estimate SHAP values for any model. Because it makes no assumptions about the model type, KernelExplainer is slower than the other, model-specific algorithms.

  • Census income classification with scikit-learn - Using the standard adult census income dataset, this notebook trains a k-nearest neighbors classifier using scikit-learn and then explains predictions using shap.

  • ImageNet VGG16 Model with Keras - Explain the classic VGG16 convolutional neural network's predictions for an image. This works by applying the model agnostic Kernel SHAP method to a super-pixel segmented image.

  • Iris classification - A basic demonstration using the popular iris species dataset. It explains predictions from six different models in scikit-learn using shap.

Documentation notebooks

These notebooks comprehensively demonstrate how to use specific functions and objects.

Methods Unified by SHAP

  1. LIME: Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. "Why should I trust you?: Explaining the predictions of any classifier." Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2016.

  2. Shapley sampling values: Strumbelj, Erik, and Igor Kononenko. "Explaining prediction models and individual predictions with feature contributions." Knowledge and information systems 41.3 (2014): 647-665.

  3. DeepLIFT: Shrikumar, Avanti, Peyton Greenside, and Anshul Kundaje. "Learning important features through propagating activation differences." arXiv preprint arXiv:1704.02685 (2017).

  4. QII: Datta, Anupam, Shayak Sen, and Yair Zick. "Algorithmic transparency via quantitative input influence: Theory and experiments with learning systems." Security and Privacy (SP), 2016 IEEE Symposium on. IEEE, 2016.

  5. Layer-wise relevance propagation: Bach, Sebastian, et al. "On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation." PloS one 10.7 (2015): e0130140.

  6. Shapley regression values: Lipovetsky, Stan, and Michael Conklin. "Analysis of regression in game theory approach." Applied Stochastic Models in Business and Industry 17.4 (2001): 319-330.

  7. Tree interpreter: Saabas, Ando. Interpreting random forests. http://blog.datadive.net/interpreting-random-forests/

Citations

The algorithms and visualizations used in this package came primarily out of research in Su-In Lee's lab at the University of Washington, and Microsoft Research. If you use SHAP in your research we would appreciate a citation to the appropriate paper(s):

Owner

  • Name: shap
  • Login: shap
  • Kind: organization

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 2,287
  • Total Committers: 270
  • Avg Commits per committer: 8.47
  • Development Distribution Score (DDS): 0.627
Past Year
  • Commits: 174
  • Committers: 39
  • Avg Commits per committer: 4.462
  • Development Distribution Score (DDS): 0.678
Top Committers
Name Email Commits
Scott Lundberg s****1@c****u 852
ryserrao s****5@g****m 215
connortann 7****n 199
Jeremy Goh 3****y 78
Tobias Pitters 3****e 72
dependabot[bot] 4****] 66
Vivek Chettiar v****r@g****m 57
Ilya Matiach i****t@m****m 35
Gabriel Tseng g****g@m****a 33
Rory Mitchell r****z@g****m 32
Quentin q****d@g****m 27
jsu27 j****u@g****m 21
pre-commit-ci[bot] 6****] 21
anusham1990 a****0 20
lrjball l****l@g****m 18
Scott Lundberg s****1@S****m 18
Floid Gilbert f****t@g****m 17
dsgibbons d****4@g****m 13
Jason Tam J****2@g****m 12
kodonnell k****l 10
Ryan Serrao r****o@m****m 10
Maggie Wu w****6@g****m 9
Jorge C. Leitao j****o@g****m 9
Zakaria Nacer 4****4 8
alexander a****v@g****m 8
KOLANICH K****H 8
Alexis Drakopoulos 3****s 7
xzzxzxxz z****1@g****m 7
Zakaria Nacer 4****r 7
Primož Godec p****9@g****m 7
and 240 more...

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 1,727
  • Total pull requests: 837
  • Average time to close issues: almost 3 years
  • Average time to close pull requests: 5 months
  • Total issue authors: 1,351
  • Total pull request authors: 140
  • Average comments per issue: 3.63
  • Average comments per pull request: 1.97
  • Merged pull requests: 561
  • Bot issues: 0
  • Bot pull requests: 141
Past Year
  • Issues: 118
  • Pull requests: 262
  • Average time to close issues: 13 days
  • Average time to close pull requests: 15 days
  • Issue authors: 85
  • Pull request authors: 41
  • Average comments per issue: 0.54
  • Average comments per pull request: 0.9
  • Merged pull requests: 163
  • Bot issues: 0
  • Bot pull requests: 62
Top Authors
Issue Authors
  • connortann (71)
  • CloseChoice (32)
  • anusham1990 (9)
  • juanramonua (7)
  • thatlittleboy (7)
  • ahmedabbas81 (6)
  • ghost (6)
  • jlevy44 (6)
  • vaishkiva (5)
  • DiliSR (5)
  • antonkulaga (5)
  • nickcorona (5)
  • PARODBE (4)
  • noxthot (4)
  • maggiewu19 (4)
Pull Request Authors
  • connortann (217)
  • CloseChoice (155)
  • dependabot[bot] (106)
  • thatlittleboy (41)
  • pre-commit-ci[bot] (35)
  • znacer (14)
  • tylerjereddy (7)
  • Ja-Tink (7)
  • LakshmanKishore (6)
  • bewygs (6)
  • imatiach-msft (6)
  • owenlamont (4)
  • davidefiocco (4)
  • fabianliebig (4)
  • noxthot (4)
Top Labels
Issue Labels
stale (952) bug (235) todo (105) enhancement (103) awaiting feedback (48) visualization (31) question (30) deep explainer (22) documentation (20) ci (20) help wanted (20) good first issue (14) dependencies (11) paper (7) javascript (5) regression (4) duplicate (3) wontfix (3) skip-changelog (2) good read (2) BREAKING (2)
Pull Request Labels
skip-changelog (178) dependencies (140) documentation (97) bug (73) ci (70) stale (65) enhancement (61) visualization (60) javascript (40) BREAKING (18) awaiting feedback (12) python (12) todo (8) duplicate (3) good first issue (2)

Packages

  • Total packages: 3
  • Total downloads:
    • pypi 7,060,418 last-month
  • Total docker downloads: 398,536
  • Total dependent packages: 300
    (may contain duplicates)
  • Total dependent repositories: 2,706
    (may contain duplicates)
  • Total versions: 144
  • Total maintainers: 3
pypi.org: shap

A unified approach to explain the output of any machine learning model.

  • Versions: 107
  • Dependent Packages: 298
  • Dependent Repositories: 2,655
  • Downloads: 7,060,418 Last month
  • Docker Downloads: 398,536
Rankings
Dependent packages count: 0.1%
Stargazers count: 0.1%
Downloads: 0.1%
Forks count: 0.2%
Dependent repos count: 0.2%
Average: 0.3%
Docker downloads count: 1.0%
Maintainers (3)
Last synced: 8 months ago
proxy.golang.org: github.com/shap/shap
  • Versions: 32
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.4%
Average: 5.6%
Dependent repos count: 5.8%
Last synced: 6 months ago
anaconda.org: shap

SHAP (SHapley Additive exPlanations) is a unified approach to explain the output of any machine learning model. SHAP connects game theory with local explanations, uniting several previous methods and representing the only possible consistent and locally accurate additive feature attribution method based on expectations.

  • Versions: 5
  • Dependent Packages: 2
  • Dependent Repositories: 51
Rankings
Stargazers count: 4.6%
Forks count: 5.9%
Average: 13.7%
Dependent packages count: 20.5%
Dependent repos count: 24.0%
Last synced: 6 months ago

Dependencies

.github/workflows/build_wheels.yml actions
  • actions/checkout v2 composite
  • actions/download-artifact v2 composite
  • actions/setup-python v2 composite
  • actions/upload-artifact v2 composite
.github/workflows/codeql-analysis.yml actions
  • actions/checkout v3 composite
  • github/codeql-action/analyze v2 composite
  • github/codeql-action/autobuild v2 composite
  • github/codeql-action/init v2 composite
.github/workflows/run_tests.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • codecov/codecov-action v2 composite
.github/workflows/test_js.yml actions
  • actions/checkout v3 composite
  • actions/setup-node v3 composite
docs/user_studies/sickness_scores/package.json npm
  • babel-core ^6 development
  • babel-loader ^6 development
  • babel-preset-es2015 ^6 development
  • babel-preset-react ^6 development
  • webpack ^2 development
  • webpack-dev-server ^2 development
  • d3 ^4
  • lodash ^4
  • material-ui ^0.16
  • react ^15
  • react-dom ^15
  • react-router ^3
  • react-tap-event-plugin ^2
javascript/package-lock.json npm
  • 708 dependencies
javascript/package.json npm
  • @babel/core ^7.9.0 development
  • @babel/preset-env ^7.22.9 development
  • @babel/preset-react ^7.22.5 development
  • babel-jest ^29.6.1 development
  • babel-loader ^8.1.0 development
  • jest ^29.6.1 development
  • jest-environment-jsdom ^29.6.1 development
  • react-test-renderer ^15.7.0 development
  • webpack ^5.88.2 development
  • webpack-cli ^4.10.0 development
  • webpack-dev-server ^4.15.1 development
  • d3 ^7
  • lodash ^4
  • react ^15
  • react-dom ^15
  • react-tap-event-plugin ^2
pyproject.toml pypi
  • cloudpickle *
  • numba *
  • numpy *
  • packaging >20.9
  • pandas *
  • scikit-learn *
  • scipy *
  • slicer ==0.0.7
  • tqdm >=4.27.0
requirements.txt pypi
setup.py pypi