SHAP (SHapley Additive exPlanations)

August 03, 2022

Photo by Nubelson Fernandes on Unsplash

In recent years, there have been multiple scandals involving machine learning models that made unjust decisions on the basis of gender or race. The EU is seeking to pass legislation requiring AI systems to meet certain transparency requirements. Specifically, human beings must be able to audit the decisions made by a model in order to determine which factors led to a given prediction.


Only a subset of machine learning models is considered intrinsically interpretable. These include Linear Regression, Logistic Regression, Naive Bayes, Decision Trees and other tree-based models (e.g. Random Forest).

Linear Models

Linear models learn a set of coefficients that are then used in a weighted sum to make a prediction. These coefficients can be interpreted as a crude type of feature importance score.

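Let’s try running some code to gain a greater understanding of what we mean here. We begin by importing the required libraries and loading the Boston housing dataset.

from sklearn.linear_model import LinearRegression
from matplotlib import pyplot as plt
from sklearn.datasets import load_boston

# load_boston was removed in scikit-learn 1.2, so this requires an older version
boston = load_boston()
# Features and target (assumed to be the dataset's data and target arrays)
X = boston.data
y = boston.target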

We create an instance of the LinearRegression class and train the model.

lr = LinearRegression()
lr.fit(X, y)

We plot the coefficients associated with each feature.

sorted_idx = lr.coef_.argsort()  
plt.barh(boston.feature_names[sorted_idx], lr.coef_[sorted_idx])

Looking at the bar chart, you might conclude that a small change in the nitric oxides concentration (NOX) produces the largest (negative) change in the price of a house relative to all other features. In reality, the coefficient is larger than the rest because of the scale of the data. For example, the feature B is on the order of hundreds, whereas NOX is on the order of tenths and the price is on the order of tens. We would therefore expect the coefficient of B to be much smaller than that of NOX, since it takes a much greater absolute change in B to produce a comparable change in the dependent variable.
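To make the effect of scale concrete, here is a minimal sketch on synthetic data (not the Boston dataset). The two features, their scales and the helper fit_coefficients are assumptions made up for illustration: both features influence the target equally per standard deviation, yet their raw coefficients differ by orders of magnitude until the columns are standardized.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Two hypothetical features with equal true influence but very different scales.
small = rng.normal(0.5, 0.1, n)      # on the order of tenths (like NOX)
large = rng.normal(350.0, 70.0, n)   # on the order of hundreds (like B)

# Each feature moves the target by one unit per standard deviation.
y = (small - 0.5) / 0.1 + (large - 350.0) / 70.0 + rng.normal(0, 0.1, n)

def fit_coefficients(X, y):
    """Ordinary least squares with an intercept; returns only the slopes."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta[1:]

X = np.column_stack([small, large])
raw = fit_coefficients(X, y)

# Standardize each column to zero mean and unit variance, then refit.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
standardized = fit_coefficients(X_std, y)

print("raw coefficients:         ", raw)
print("standardized coefficients:", standardized)
```

After standardization the two coefficients should come out roughly equal, which is what we would want from an importance score here.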

Correlation coefficients are a standardized alternative to regression coefficients. Correlation values all fall between -1 and +1. Thus, you can use them to compare the strength of the relationships between different pairs of variables despite differing units of measurement.

from sklearn.feature_selection import r_regression  
result = r_regression(X, y)  
sorted_idx = result.argsort()  
plt.barh(boston.feature_names[sorted_idx], result[sorted_idx])

As we can see, the LSTAT feature has the strongest correlation (a negative one) with the house price.
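To see why correlation values can be compared across features with different units, the following sketch computes the Pearson correlation by hand on synthetic data. The helper pearson_r and the data are illustrative assumptions, not part of sklearn.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    xc = x - x.mean()
    yc = y - y.mean()
    return (xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc))

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(size=200)

r_original = pearson_r(x, y)
# Changing the unit of x (e.g. metres to millimetres) leaves r unchanged.
r_rescaled = pearson_r(1000.0 * x, y)
print(r_original, r_rescaled)
```

Rescaling a feature changes its regression coefficient but not its correlation with the target, which is the property relied on above.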

Decision Tree Models

Tree-based algorithms have built-in feature importance. Every decision tree is composed of intermediate nodes and leaves. The feature used at each intermediate node is selected based on Gini impurity for classification tasks and variance reduction for regression. To elaborate, we measure how much each candidate feature reduces the variance or Gini impurity compared to the parent node, and the feature that produces the largest decrease is used as the basis for the split. For an ensemble, we then average the amount by which each feature decreases the impurity (or variance) across all trees. This number is the measure of the feature’s importance. Finally, the importances are normalized so that each one represents a share of the overall model importance.
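The idea of variance reduction can be sketched with a single split per feature. This is a deliberately simplified illustration on synthetic data, not the procedure that XGBoost or sklearn actually runs: the helper best_variance_reduction and the target are assumptions, and a real tree would keep splitting recursively and accumulate the reductions over all nodes and trees.

```python
import numpy as np

def best_variance_reduction(x, y):
    """Largest drop in weighted variance achievable by one threshold split on x."""
    order = np.argsort(x)
    y_sorted = y[order]
    n = len(y)
    parent = y.var()
    best = 0.0
    for i in range(1, n):  # split between sorted positions i-1 and i
        left, right = y_sorted[:i], y_sorted[i:]
        child = (len(left) * left.var() + len(right) * right.var()) / n
        best = max(best, parent - child)
    return best

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
# Hypothetical target: feature 0 matters most, feature 1 a little, feature 2 not at all.
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=300)

reductions = np.array([best_variance_reduction(X[:, j], y) for j in range(X.shape[1])])
importances = reductions / reductions.sum()  # normalize to shares of the total
print(importances)
```

The final normalization step mirrors the way the importances reported by tree libraries sum to one.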

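In this example, we will make use of the XGBoost algorithm. We begin by importing the required libraries and loading the Boston housing dataset.

from xgboost import XGBRegressor
from matplotlib import pyplot as plt
from sklearn.datasets import load_boston

# load_boston was removed in scikit-learn 1.2, so this requires an older version
boston = load_boston()
# Features and target (assumed to be the dataset's data and target arrays)
X = boston.data
y = boston.target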

We create an instance of the XGBRegressor class and train the model.

model = XGBRegressor()
model.fit(X, y)

We plot the importance of each feature used in making the final prediction by accessing the feature_importances_ attribute.

sorted_idx = model.feature_importances_.argsort()  
plt.barh(boston.feature_names[sorted_idx], model.feature_importances_[sorted_idx])

As we can see, the percentage of the population with lower socioeconomic status (LSTAT) in the area had the greatest impact on the model’s predictions.


In machine learning, the term black box describes models that cannot be understood by looking at their parameters (e.g. neural networks, SVMs). Model-agnostic methods can be applied to any machine learning model after it has been trained in order to understand why it came to certain conclusions.


As the name implies, permutation feature importance randomly shuffles (permutes) the values of one feature at a time and measures the resulting change in the model’s performance. The features whose shuffling degrades the performance the most are considered the most important ones.
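The procedure can be sketched by hand in a few lines. Everything here is an illustrative assumption: the data is synthetic, a least-squares linear model stands in for the black box, and permutation_importance_sketch is a hypothetical helper rather than sklearn's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
X = rng.normal(size=(n, 3))
# Hypothetical target: depends strongly on feature 0, weakly on feature 1, not on feature 2.
y = 4.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.1, size=n)

# Stand-in "black box": a least-squares linear model fitted with NumPy.
A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict(X):
    return beta[0] + X @ beta[1:]

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

def permutation_importance_sketch(predict, X, y, n_repeats=10):
    """Mean increase in MSE when one column at a time is shuffled."""
    baseline = mse(y, predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])  # break the link to y
            increases.append(mse(y, predict(X_perm)) - baseline)
        importances[j] = np.mean(increases)
    return importances

importances = permutation_importance_sketch(predict, X, y)
print(importances)
```

Shuffling a feature the model relies on raises the error sharply, while shuffling an irrelevant feature leaves it almost unchanged.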

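Let’s try this approach on an algorithm that is not natively explainable (e.g. k-nearest neighbors). Just like we’ve done before, we import the required libraries and load the dataset.

from sklearn.inspection import permutation_importance
from sklearn.neighbors import KNeighborsRegressor
from matplotlib import pyplot as plt
from sklearn.datasets import load_boston

# load_boston was removed in scikit-learn 1.2, so this requires an older version
boston = load_boston()
# Features and target (assumed to be the dataset's data and target arrays)
X = boston.data
y = boston.target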

We create an instance of the KNeighborsRegressor class and train the model.

knn = KNeighborsRegressor()
knn.fit(X, y)

We use the permutation_importance function provided by sklearn.

results = permutation_importance(knn, X, y, scoring='neg_mean_squared_error')

We plot the importance of each feature used in making the final prediction.

sorted_idx = results.importances_mean.argsort()  
plt.barh(boston.feature_names[sorted_idx], results.importances_mean[sorted_idx])

The permutation based method is computationally expensive and can have problems with highly-correlated features.

For example, suppose our dataset contained two features: square footage and number of rooms. The two features are highly correlated. Thus, when increasing the size of the house, you would expect the number of rooms to increase as well. However, if, while generating the permutations, you simply use different values for the number of rooms while keeping all the other feature values fixed, you end up with data points that would not exist in reality.

Secondly, introducing a feature can decrease the importance of a correlated feature because the importance will be split between both features (in the case of tree-based models). For example, let’s assume that the number of rooms has the greatest influence on the price of a house. We add the square footage feature to our dataset and train the model. Some of the trees in the random forest use the number of rooms as the basis for their split(s), others the square footage, others both and others neither. Instead of being at the top of the list of important features, each feature is now somewhere in the middle, since their effect on the outcome is in essence shared.
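The first of these problems can be checked numerically. The sketch below uses made-up housing data (the ranges and the relationship between square footage and rooms are assumptions) and shows that shuffling only the room counts destroys their correlation with square footage and creates implausible combinations, such as very small houses with many rooms.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical housing data: room count follows square footage closely.
sqft = rng.uniform(500, 4000, n)
rooms = np.round(sqft / 400 + rng.normal(0, 0.5, n)).clip(min=1)

corr_before = np.corrcoef(sqft, rooms)[0, 1]

# Shuffle only the room counts, as a permutation-based method would.
rooms_shuffled = rng.permutation(rooms)
corr_after = np.corrcoef(sqft, rooms_shuffled)[0, 1]

# Count implausible rows: small houses (< 1000 sq ft) paired with 8 or more rooms.
odd_before = int(np.sum((sqft < 1000) & (rooms >= 8)))
odd_after = int(np.sum((sqft < 1000) & (rooms_shuffled >= 8)))

print(corr_before, corr_after, odd_before, odd_after)
```

The model is then scored on rows it never saw anything like during training, which can distort the resulting importance values.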


SHapley Additive exPlanations, or SHAP for short, is a game theoretic approach to explain the output of any machine learning model.
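To give a feel for the game-theoretic idea, here is a brute-force sketch of exact Shapley values for a tiny model with three features. It is an illustration under stated assumptions, not how the shap library works internally: the model, the data and the helper shapley_values are hypothetical, and absent features are simply replaced by values from a background sample. Each feature's value is its weighted average marginal contribution over all coalitions of the other features.

```python
from itertools import combinations
from math import factorial

import numpy as np

def shapley_values(f, x, background):
    """Exact Shapley values for one sample x.

    The value of a coalition S is the mean model output when the features in S
    are fixed to x and the remaining features take their background values.
    """
    n_features = len(x)
    features = range(n_features)

    def value(subset):
        data = background.copy()
        idx = np.array(subset, dtype=int)
        data[:, idx] = x[idx]
        return f(data).mean()

    phi = np.zeros(n_features)
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n_features):
            for subset in combinations(others, size):
                weight = (factorial(size) * factorial(n_features - size - 1)
                          / factorial(n_features))
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi

# Hypothetical model with an interaction term, standing in for a trained model.
def model(X):
    return 2.0 * X[:, 0] + X[:, 1] * X[:, 2]

rng = np.random.default_rng(0)
background = rng.normal(size=(100, 3))
x = np.array([1.0, 2.0, 3.0])

phi = shapley_values(model, x, background)
baseline = model(background).mean()
print(phi, baseline + phi.sum(), model(x[None, :])[0])
```

The baseline plus the sum of the Shapley values reproduces the model's prediction for x, which is the additive property the method is named after. The cost grows exponentially with the number of features, which is why practical implementations rely on approximations or model-specific shortcuts.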

To make use of SHAP in Python, we import the following library:

import shap

When the algorithm is set to auto (default) the Explainer class will automatically determine what to use. Since we’re passing a tree-based model, it will default to the TreeExplainer class. The TreeExplainer implementation provides fast local explanations with guaranteed consistency. Unlike the KernelExplainer which must approximate Shapley values, the TreeExplainer can compute Shapley values in low-order polynomial time by leveraging the internal structure of tree-based models.

We obtain the Shapley values as follows:

explainer = shap.Explainer(model)  
shap_values = explainer(X)

Features with large absolute Shapley values are considered important. Since we want to obtain the global importance, we average the absolute Shapley values per feature across the entire dataset. Fortunately, the shap library provides a handy function summary_plot for plotting the global importance calculated using the method just described.

shap.summary_plot(shap_values, X, plot_type="bar")
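The averaging step behind the bar chart can be sketched without the shap library. The example assumes a hypothetical linear model with independent features, for which the Shapley value of a feature has a simple closed form; the weights and data are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([1.0, 5.0, 0.2])
weights = np.array([2.0, 0.1, 1.0])  # hypothetical linear model f(x) = X @ weights

# For a linear model (treating features independently), the Shapley value of
# feature j for sample i is weights[j] * (X[i, j] - mean of column j).
shap_matrix = weights * (X - X.mean(axis=0))  # shape: (n_samples, n_features)

# Global importance: average absolute Shapley value per feature.
global_importance = np.abs(shap_matrix).mean(axis=0)
ranking = np.argsort(global_importance)[::-1]
print(global_importance, ranking)
```

Note that feature 1 outranks feature 2 despite its smaller weight, because the Shapley values account for how much each feature actually varies in the data.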

Until now, we’ve only dealt with global model interpretability. However, we often benefit from interpreting the model’s predictions at a local level (i.e. the prediction for a single sample). There exist two approaches that can quantify a feature’s local importance for an individual prediction made by a tree-based model. The first is simply reporting the decision path, which is unhelpful for ensembles of many trees, and the second is the Saabas method, which doesn’t give splits near the root enough credit compared to those near the leaves. Fortunately, we can use SHAP.


In a SHAP waterfall plot for a single prediction, the left-hand side shows the values of the different features for the data point under consideration. E[f(X)] is the baseline, i.e. the expected model output (the average prediction over the background data). f(x) is the value predicted by our model for this data point. In the center, we have the SHAP values. Blue means that the feature value led the model to decrease its prediction, whereas red means that the feature value led the model to increase its prediction. It’s important to note that these SHAP values are valid for this observation only. With other data points, the SHAP values will change. In order to understand the importance or contribution of the features for the whole dataset, another plot, such as a beeswarm plot, can be used.

shap.summary_plot(shap_values, X)

Like many other permutation-based interpretation methods, the Shapley value method suffers from inclusion of unrealistic data instances when features are correlated.


In this day and age, it is increasingly hard to justify deploying a model to production without first being able to explain how it made certain predictions. Certain machine learning algorithms are interpretable, whereas others are considered a black box. In the case of the latter, we can use model-agnostic techniques to show which features led to a decision regarding a data point.

Written by Cory Maklin