ML Explainability: SHAP, LIME, Integrated Gradients and EU AI Act
The model predicts credit score 340 (reject). Customer: "Why?" Compliance: "Show documentation and a decision explanation." The EU AI Act (obligations phasing in from 2025) requires explainability for high-risk systems. "The model decided so" is no longer an acceptable answer.
Three Levels of Explainability
Global: understand the model overall. Which features matter? How does each feature affect the prediction on average? Tools: SHAP summaries, PDP (Partial Dependence Plots), permutation importance.
Local: explain a specific prediction. Why was this credit application rejected? Which pixels led to the "cat" classification? Tools: SHAP waterfall, LIME, Integrated Gradients.
Contrastive: answer "what if" questions. If income were $10k higher, would the application be approved? Tools: DiCE (Diverse Counterfactual Explanations), alibi.
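The contrastive question can be sketched without any library: search for the smallest change to one feature that flips the decision. A minimal sketch with a hypothetical linear scoring function (`score`, the threshold, and all numbers are illustrative, not a real scorecard); DiCE automates this for real models and adds diversity and plausibility constraints.

```python
# Minimal counterfactual search: smallest income increase that flips a
# hypothetical scoring model over the approval threshold.

def score(income: float, debt: float) -> float:
    # Hypothetical credit score, not a real scorecard.
    return 300 + 0.02 * income - 0.05 * debt

def min_income_increase(income, debt, threshold=650, step=1_000, limit=100_000):
    """Return the smallest income bump (in `step` increments) that pushes
    the score over `threshold`, or None if `limit` is reached first."""
    for bump in range(0, limit + 1, step):
        if score(income + bump, debt) >= threshold:
            return bump
    return None

bump = min_income_increase(income=20_000, debt=4_000, threshold=650)
print(bump)  # -> 8000
```

The answer ("approved if income were $8k higher") is exactly the client-facing form of explanation regulators ask for.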
SHAP: Tabular Data Standard
SHAP (SHapley Additive exPlanations) is based on cooperative game theory. Each feature receives a contribution to the prediction's deviation from the dataset average. Mathematically sound: it satisfies the efficiency, symmetry, dummy, and additivity properties.
import shap

explainer = shap.TreeExplainer(lgbm_model)
explanation = explainer(X_test)  # Explanation object, computed once

# Waterfall for one prediction
shap.plots.waterfall(explanation[0])

# Summary (beeswarm) for all predictions
shap.plots.beeswarm(explanation)
TreeExplainer: fast and exact for tree models (LightGBM, XGBoost, Random Forest, CatBoost). Complexity O(TLD²) per explanation, where T is the number of trees, L the number of leaves, D the depth. For 1000 trees of depth 6: milliseconds per explanation.
LinearExplainer: for linear models (logistic regression, Ridge). Analytical, instant.
KernelExplainer: model-agnostic, works with any model. Slow: O(2^M) for M features. In practice, approximated with nsamples=1000–5000. For neural nets, prefer DeepExplainer or GradientExplainer.
Common problem: SHAP splits credit between correlated features. Mathematically correct but visually confusing: income and income_log get similar SHAP values even though the model effectively uses only one. Solution: remove duplicate features before training.
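The game-theoretic definition, and the O(2^M) cost of KernelExplainer, can both be seen in a brute-force Shapley computation. A sketch for a toy three-feature game (the payoff table is invented for illustration):

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value):
    """Exact Shapley values by enumerating all feature subsets.
    `value(S)` returns the model payoff for subset S.
    The 2^M subset enumeration is exactly why KernelExplainer is slow."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for k in range(n):
            for S in combinations(others, k):
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += w * (value(set(S) | {i}) - value(set(S)))
        phi[i] = total
    return phi

# Toy payoff table for features a, b, c (hypothetical numbers).
payoffs = {frozenset(): 0.0, frozenset('a'): 10, frozenset('b'): 20,
           frozenset('c'): 5, frozenset('ab'): 35, frozenset('ac'): 15,
           frozenset('bc'): 25, frozenset('abc'): 40}
phi = shapley_values('abc', lambda S: payoffs[frozenset(S)])

# Efficiency property: attributions sum to f(all) - f(none) = 40
assert abs(sum(phi.values()) - 40.0) < 1e-9
```

The efficiency check at the end is the property that makes SHAP values add up to the prediction's deviation from the baseline.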
LIME: Faster, Less Precise, Good for NLP
LIME (Local Interpretable Model-Agnostic Explanations) builds a local linear approximation around an example. Faster than SHAP for complex neural networks, but unstable: two runs on the same example may give different explanations.
Its strength is text explanations. LimeTextExplainer shows which words affected the classification. For quick debugging of a text classifier it is a convenient tool.
from lime.lime_text import LimeTextExplainer
explainer = LimeTextExplainer(class_names=['neg', 'pos'])
exp = explainer.explain_instance(text, classifier.predict_proba, num_features=10)
exp.show_in_notebook()
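What LIME does under the hood can be hand-rolled for a one-dimensional case: sample perturbations, weight by proximity, fit a weighted linear model. A sketch with a hypothetical black-box function (`blackbox` and all parameters are illustrative):

```python
import math
import random

def blackbox(x):
    # Hypothetical nonlinear "model" to explain around x0.
    return x ** 2

def lime_slope(f, x0, n_samples=2000, width=0.5, seed=0):
    """LIME in miniature: sample perturbations around x0, weight them by
    proximity, fit a weighted least-squares line. The slope is the local
    explanation; re-sampling changes it slightly, which is the
    instability mentioned above."""
    rng = random.Random(seed)
    xs = [x0 + rng.gauss(0, width) for _ in range(n_samples)]
    ys = [f(x) for x in xs]
    ws = [math.exp(-((x - x0) ** 2) / width ** 2) for x in xs]
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    cov = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    return cov / var

slope = lime_slope(blackbox, x0=3.0)  # true local gradient is 2 * x0 = 6
```

The fitted slope approximates the local gradient; a different seed gives a slightly different answer, which is the stability caveat above.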
Integrated Gradients for Neural Networks
For deep learning (CNNs, Transformers) neither SHAP's KernelExplainer nor LIME is satisfactory: too slow or too inaccurate. Integrated Gradients (IG) is gradient-based and theoretically justified (axioms: completeness, sensitivity, implementation invariance).
IG integrates gradients along the straight path from a baseline (zeros or the dataset mean) to the actual input. The result is an attribution map of per-pixel or per-token contributions.
from captum.attr import IntegratedGradients

ig = IntegratedGradients(model)
attributions = ig.attribute(
    inputs=input_tensor,
    baselines=baseline_tensor,  # zeros or dataset mean
    target=predicted_class,
    n_steps=300,  # more steps -> tighter integral approximation
)
Meta's Captum library is the PyTorch standard. It includes IG, DeepLift, GradientShap, LayerGradCam, LayerConductance.
GradCAM is simpler and faster, but theoretically weaker. It shows which regions a CNN attends to. Fine for debugging CV models, insufficient for compliance documentation.
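The IG path integral and its completeness axiom can be checked by hand for a simple differentiable function, without Captum or autograd. A sketch with a hypothetical two-feature model (`f` and its gradient are invented for illustration), using a midpoint Riemann sum in place of autograd:

```python
def f(x1, x2):
    # Hypothetical differentiable model.
    return x1 ** 2 + x1 * x2

def grad_f(x1, x2):
    # Analytic gradient of f (captum would get this via autograd).
    return (2 * x1 + x2, x1)

def integrated_gradients(x, baseline, n_steps=300):
    """Midpoint Riemann approximation of the IG path integral along the
    straight line from baseline to x."""
    attr = [0.0, 0.0]
    for k in range(n_steps):
        a = (k + 0.5) / n_steps  # midpoint of step k
        point = [b + a * (xi - b) for xi, b in zip(x, baseline)]
        g = grad_f(*point)
        for i in range(2):
            attr[i] += (x[i] - baseline[i]) * g[i] / n_steps
    return attr

x, baseline = (2.0, 3.0), (0.0, 0.0)
attr = integrated_gradients(x, baseline)

# Completeness: attributions sum to f(x) - f(baseline) = 10 - 0
assert abs(sum(attr) - (f(*x) - f(*baseline))) < 1e-6
```

The completeness check at the end is the property that makes IG usable for compliance: the attribution fully accounts for the prediction's change from the baseline.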
EU AI Act: What's Needed in Practice
The EU AI Act (obligations phasing in 2024–2026) requires for high-risk systems (credit scoring, medical AI, hiring, law enforcement):
- Technical model documentation
- Decision logging with audit trail
- Individual decision explanation on request
- Risk assessment and mitigation
- Human oversight
Technically: every prediction is saved with its input features, output, timestamp, model version, and a pre-computed explanation. SHAP values are computed at inference time and stored with the prediction.
LLM systems are harder: there is no standard explanation method, and attention weights are not reliable attribution. Current practice: log the full context, the retrieved RAG chunks, and chain-of-thought reasoning as a proxy explanation.
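One way to structure such a record is a sketch like the following; the schema, field names, and model version tag are all hypothetical, and the in-memory sink stands in for a real append-only store.

```python
import json
import time
from dataclasses import dataclass, asdict, field

@dataclass
class PredictionRecord:
    """Hypothetical audit-trail record: one row per prediction, with the
    pre-computed explanation stored alongside the output."""
    model_version: str
    features: dict
    prediction: float
    shap_values: dict  # feature -> contribution
    timestamp: float = field(default_factory=time.time)

def log_prediction(record: PredictionRecord, sink: list) -> None:
    # In production the sink would be an append-only store (DB table,
    # object storage), not an in-memory list.
    sink.append(json.dumps(asdict(record), sort_keys=True))

audit_log: list = []
log_prediction(PredictionRecord(
    model_version="credit-lgbm-1.4.2",  # hypothetical version tag
    features={"income": 20_000, "debt": 4_000},
    prediction=340.0,
    shap_values={"income": -180.0, "debt": -95.0},
), audit_log)
```

Serializing the explanation at write time matters: re-computing SHAP later against a retrained model would not reproduce the decision under audit.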
Pre-deployment for high-risk systems:
- Assessment: does the system fall under AI Act Annex III high-risk
- Technical passport: architecture, training data, metrics, limitations
- Logging system: decisions + retention (10+ years for some categories)
- Explanation mechanism in the production pipeline
- User dispute procedure
Model Cards and Documentation
Google's Model Card Toolkit is the documentation standard. It captures: intended use, evaluation results by demographic subgroup, limitations, ethical considerations.
For sklearn/LightGBM, skorecard and ydata-profiling auto-generate basic documentation. Production use needs customization per organization.
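The core of such a document is small enough to sketch by hand. A stripped-down renderer (the field names, model name, and metric values are illustrative, not the Model Card Toolkit schema):

```python
def render_model_card(card: dict) -> str:
    """Render a minimal model card as markdown. The sections mirror the
    items listed above; field names are hypothetical."""
    lines = [f"# Model Card: {card['name']}"]
    for section in ("intended_use", "limitations", "ethical_considerations"):
        lines.append(f"## {section.replace('_', ' ').title()}")
        lines.append(card[section])
    lines.append("## Evaluation By Subgroup")
    for group, auc in card["subgroup_auc"].items():
        lines.append(f"- {group}: AUC {auc:.3f}")
    return "\n".join(lines)

card = {
    "name": "credit-lgbm",  # hypothetical model name
    "intended_use": "Consumer credit pre-screening; not for final decisions.",
    "limitations": "Trained on 2019-2023 data; may drift under new rules.",
    "ethical_considerations": "Monitored for disparate impact by age group.",
    "subgroup_auc": {"age<30": 0.81, "age>=30": 0.84},  # illustrative
}
print(render_model_card(card))
```

The per-subgroup evaluation section is the part auditors look at first; keeping it machine-generated from the evaluation pipeline prevents it from going stale.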
Workflow
Start with a regulatory assessment: does the system fall under EU AI Act high-risk, GDPR Article 22 (automated decisions), or industry-specific rules (Basel IV for banking, MDR for medical devices).
Then: integrate SHAP into the inference pipeline, develop an explanation UI (if client-facing explanations are needed), set up logging, prepare model documentation.
Timelines: adding SHAP explanations to a ready model takes 1–2 weeks; a full compliance solution with documentation, UI, and logging takes 6–14 weeks.







