| Debiasing concepts |
Debiasing Concept Bottleneck Models with Instrumental Variables |
ICLR 2021 submissions page - Accepted as Poster |
|
causality |
| Prototype Trajectory |
Interpretable Sequence Classification Via Prototype Trajectory |
ICLR 2021 submissions page |
|
This Looks Like That-styled RNN
| Shapley dependence assumption |
Shapley explainability on the data manifold |
ICLR 2021 submissions page |
|
|
| High dimension Shapley |
Human-interpretable model explainability on high-dimensional data |
ICLR 2021 submissions page |
|
|
| L2X-like paper |
A Learning Theoretic Perspective on Local Explainability |
ICLR 2021 submissions page |
|
|
| Evaluation |
Evaluation of Similarity-based Explanations |
ICLR 2021 submissions page |
|
like the Adebayo paper, for This Looks Like That-styled methods
| Model correction |
Defuse: Debugging Classifiers Through Distilling Unrestricted Adversarial Examples |
ICLR 2021 submissions page |
|
|
| Subspace explanation |
Constraint-Driven Explanations of Black-Box ML Models |
ICLR 2021 submissions page |
|
to see how close to MUSE by Hima Lakkaraju 2019 |
| Catastrophic forgetting |
Remembering for the Right Reasons: Explanations Reduce Catastrophic Forgetting |
ICLR 2021 submissions page |
Code available in their Supplementary zip file |
|
| Non trivial counterfactual explanations |
Beyond Trivial Counterfactual Generations with Diverse Valuable Explanations |
ICLR 2021 submissions page |
|
|
| Explainable by Design |
Interpretability Through Invertibility: A Deep Convolutional Network With Ideal Counterfactuals And Isosurfaces |
ICLR 2021 submissions page |
|
|
| Gradient attribution |
Rethinking the Role of Gradient-based Attribution Methods for Model Interpretability |
ICLR 2021 submissions page |
|
looks like an extension of the Sixt et al. paper
| Mask based Explainable by Design |
Investigating and Simplifying Masking-based Saliency Methods for Model Interpretability |
ICLR 2021 submissions page |
|
|
| NBDT - Explainable by Design |
NBDT: Neural-Backed Decision Tree |
ICLR 2021 submissions page |
|
|
| Variational Saliency Maps |
Variational saliency maps for explaining model's behavior |
ICLR 2021 submissions page |
|
|
| Network dissection with coherency or stability metric |
Importance and Coherence: Methods for Evaluating Modularity in Neural Networks |
ICLR 2021 submissions page |
|
|
| Modularity |
Are Neural Nets Modular? Inspecting Functional Modularity Through Differentiable Weight Masks |
ICLR 2021 submissions page |
Code made anonymous for review, link given in paper |
|
| Explainable by design |
A self-explanatory method for the black box problem on discrimination part of CNN |
ICLR 2021 submissions page |
|
seems to apply concepts from game theory
| Attention not Explanation |
Why is Attention Not So Interpretable? |
ICLR 2021 submissions page |
|
|
| Ablation Saliency |
Ablation Path Saliency |
ICLR 2021 submissions page |
|
|
| Explainable Outlier Detection |
Explainable Deep One-Class Classification |
ICLR 2021 submissions page |
|
|
| XAI without approximation |
Explainable AI Without Interpretable Model |
Arxiv |
|
|
| Learning theoretic Local Interpretability |
A LEARNING THEORETIC PERSPECTIVE ON LOCAL EXPLAINABILITY |
Arxiv |
|
|
| GANMEX |
GANMEX: ONE-VS-ONE ATTRIBUTIONS USING GAN-BASED MODEL EXPLAINABILITY |
Arxiv |
|
|
| Evaluating Local Explanations |
Evaluating local explanation methods on ground truth |
Artificial Intelligence Journal Elsevier |
sklearn |
|
| Structured Attention Graphs |
Structured Attention Graphs for Understanding Deep Image Classifications |
AAAI 2021 |
PyTorch |
see how close to MACE |
| Ground truth explanations |
Data Representing Ground-Truth Explanations to Evaluate XAI Methods |
AAAI 2021 |
sklearn |
trained models available in their github repository |
| AGF |
Visualization of Supervised and Self-Supervised Neural Networks via Attribution Guided Factorization |
AAAI 2021 |
PyTorch |
|
| RSP |
Interpreting Deep Neural Networks with Relative Sectional Propagation by Analyzing Comparative Gradients and Hostile Activations |
AAAI 2021 |
|
|
| HyDRA |
HYDRA: Hypergradient Data Relevance Analysis for Interpreting Deep Neural Networks |
AAAI 2021 |
PyTorch |
|
| SWAG |
SWAG: Superpixels Weighted by Average Gradients for Explanations of CNNs |
WACV 2021 |
|
|
| FastIF |
FASTIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging |
Arxiv |
PyTorch |
|
| EVET |
EVET: Enhancing Visual Explanations of Deep Neural Networks Using Image Transformations |
WACV 2021 |
|
|
| Local Attribution Baselines |
On Baselines for Local Feature Attributions |
AAAI 2021 |
PyTorch |
|
| Differentiated Explanations |
Differentiated Explanation of Deep Neural Networks with Skewed Distributions |
IEEE - TPAMI journal |
PyTorch |
|
| Human game based survey |
Explainable AI and Adoption of Algorithmic Advisors: an Experimental Study |
Arxiv |
|
|
| Explainable by design |
Learning Semantically Meaningful Features for Interpretable Classifications |
Arxiv |
|
|
| Expred |
Explain and Predict, and then Predict again |
ACM WSDM 2021 |
PyTorch |
|
| Progressive Interpretation |
An Information-theoretic Progressive Framework for Interpretation |
Arxiv |
PyTorch |
|
| UCAM |
Uncertainty Class Activation Map (U-CAM) using Gradient Certainty method |
IEEE - TIP |
Project Page |
PyTorch |
| Progressive GAN explainability - smiling dataset - ICLR 2020 group |
Explaining the Black-box Smoothly - A Counterfactual Approach |
Arxiv |
|
|
| Head pasted in another image - experimented |
WHAT DO DEEP NETS LEARN? CLASS-WISE PATTERNS REVEALED IN THE INPUT SPACE |
Arxiv |
|
|
| Model correction |
ExplOrs Explanation Oracles and the architecture of explainability |
Paper |
|
|
| Explanations - Knowledge Representation |
A Basic Framework for Explanations in Argumentation |
IEEE |
|
|
| Eigen CAM |
Eigen-CAM: Visual Explanations for Deep Convolutional Neural Networks |
Springer |
|
|
| Evaluation of Posthoc |
How can I choose an explainer? An Application-grounded Evaluation of Post-hoc Explanations |
ACM |
|
|
| GLocalX |
GLocalX - From Local to Global Explanations of Black Box AI Models |
Arxiv |
|
|
| Consistent Interpretations |
Explainable Models with Consistent Interpretations |
AAAI 2021 |
|
|
| SIDU |
Introducing and assessing the explainable AI (XAI) method: SIDU |
Arxiv |
|
|
| cites This looks like that |
Explaining black-box classifiers using post-hoc explanations-by-example: The effect of explanations and error-rates in XAI user studies |
AIJ |
|
|
| i-Algebra |
i-Algebra: Towards Interactive Interpretability of Deep Neural Networks |
AAAI 2021 |
|
|
| Shape texture bias |
SHAPE OR TEXTURE: UNDERSTANDING DISCRIMINATIVE FEATURES IN CNNS |
ICLR 2021 |
|
|
| Class agnostic features |
THE MIND’S EYE: VISUALIZING CLASS-AGNOSTIC FEATURES OF CNNS |
Arxiv |
|
|
| IBEX |
A Multi-layered Approach for Tailored Black-box Explanations |
Paper |
Code |
|
| Relevant explanations |
Learning Relevant Explanations |
Paper |
|
|
| Guided Zoom |
Guided Zoom: Zooming into Network Evidence to Refine Fine-grained Model Decisions |
IEEE |
|
|
| XAI survey |
A Survey on Understanding, Visualizations, and Explanation of Deep Neural Networks |
Arxiv |
|
|
| Pattern theory |
Convolutional Neural Network Interpretability with General Pattern Theory |
Arxiv |
PyTorch |
|
| Gaussian Process based explanations |
Bandits for Learning to Explain from Explanations |
AAAI 2021 |
sklearn |
|
| LIFT CAM |
LIFT-CAM: Towards Better Explanations for Class Activation Mapping |
Arxiv |
|
|
| ObAIEx |
Right for the Right Reasons: Making Image Classification Intuitively Explainable |
Paper |
tensorflow |
|
| VAE based explainer |
Combining an Autoencoder and a Variational Autoencoder for Explaining the Machine Learning Model Predictions |
IEEE |
|
|
| Segmentation based explanation |
Deep Co-Attention Network for Multi-View Subspace Learning |
Arxiv |
PyTorch |
|
| Integrated CAM |
INTEGRATED GRAD-CAM: SENSITIVITY-AWARE VISUAL EXPLANATION OF DEEP CONVOLUTIONAL NETWORKS VIA INTEGRATED GRADIENT-BASED SCORING |
ICASSP 2021 |
PyTorch |
|
| Human study |
VitrAI - Applying Explainable AI in the Real World |
Arxiv |
|
|
| Attribution Mask |
Attribution Mask: Filtering Out Irrelevant Features By Recursively Focusing Attention on Inputs of DNNs |
Arxiv |
PyTorch |
|
| LIME faithfulness |
What does LIME really see in images? |
Arxiv |
Tensorflow 1.x |
|
| Assess model reliability |
Intuitively Assessing ML Model Reliability through Example-Based Explanations and Editing Model Inputs |
Arxiv |
|
|
| Perturbation + Gradient unification |
Towards the Unification and Robustness of Perturbation and Gradient Based Explanations |
Arxiv |
|
Hima Lakkaraju
| Gradients faithful? |
Do Input Gradients Highlight Discriminative Features? |
Arxiv |
PyTorch |
|
| Untrustworthy predictions |
Identifying Untrustworthy Predictions in Neural Networks by Geometric Gradient Analysis |
Arxiv |
|
|
| Explaining misclassification |
Explaining Inaccurate Predictions of Models through k-Nearest Neighbors |
Paper |
|
cites Oscar Li AAAI 2018 prototypes paper |
| Explanations inside predictions |
Have We Learned to Explain?: How Interpretability Methods Can Learn to Encode Predictions in their Interpretations |
AISTATS 2021 |
|
|
| Layerwise interpretation |
LAYER-WISE INTERPRETATION OF DEEP NEURAL NETWORKS USING IDENTITY INITIALIZATION |
Arxiv |
|
|
| Visualizing Rule Sets |
Visualizing Rule Sets: Exploration and Validation of a Design Space |
Arxiv |
PyTorch |
|
| Human experiments |
Are Explanations Helpful? A Comparative Study of the Effects of Explanations in AI-Assisted Decision-Making |
IUI 2021 |
|
|
| Attention fine-grained classification |
Interpretable Attention Guided Network for Fine-grained Visual Classification |
Arxiv |
|
|
| Concept construction |
Explaining Classifiers by Constructing Familiar Concepts |
Paper |
PyTorch |
|
| EbD |
Human-Understandable Decision Making for Visual Recognition |
Arxiv |
|
|
| Bridging XAI algorithm, Human needs |
Towards Connecting Use Cases and Methods in Interpretable Machine Learning |
Arxiv |
|
|
| Generative trustworthy classifiers |
Generative Classifiers as a Basis for Trustworthy Image Classification |
Paper |
Github |
|
| Counterfactual explanations |
Generating Interpretable Counterfactual Explanations By Implicit Minimisation of Epistemic and Aleatoric Uncertainties |
AISTATS 2021 |
PyTorch |
|
| Role categorization of CNN units |
Quantitative Effectiveness Assessment and Role Categorization of Individual Units in Convolutional Neural Networks |
ICML 2021 |
|
|
| Non-trivial counterfactual explanations |
Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations |
Arxiv |
|
|
| NP-ProtoPNet |
These do not Look Like Those: An Interpretable Deep Learning Model for Image Recognition |
IEEE |
|
|
| Correcting neural networks based on explanations |
Refining Neural Networks with Compositional Explanations |
Arxiv |
Code link given in paper, but page not found |
|
| Contrastive reasoning |
Contrastive Reasoning in Neural Networks |
Arxiv |
|
|
| Concept based |
Intersection Regularization for Extracting Semantic Attributes |
Arxiv |
|
|
| Boundary explanations |
Boundary Attributions Provide Normal (Vector) Explanations |
Arxiv |
PyTorch |
|
| Generative Counterfactuals |
ECINN: Efficient Counterfactuals from Invertible Neural Networks |
Arxiv |
|
|
| ICE |
Invertible Concept-based Explanations for CNN Models with Non-negative Concept Activation Vectors |
AAAI 2021 |
|
|
| Group CAM |
Group-CAM: Group Score-Weighted Visual Explanations for Deep Convolutional Networks |
Arxiv |
PyTorch |
|
| HMM interpretability |
Towards interpretability of Mixtures of Hidden Markov Models |
AAAI 2021 |
sklearn |
|
| Empirical Explainers |
Efficient Explanations from Empirical Explainers |
Arxiv |
PyTorch |
|
| FixNorm |
FIXNORM: DISSECTING WEIGHT DECAY FOR TRAINING DEEP NEURAL NETWORKS |
Arxiv |
|
|
| CoDA-Net |
Convolutional Dynamic Alignment Networks for Interpretable Classifications |
CVPR 2021 |
Code link given in paper. Repository not yet created |
|
| Like Dr. Chandru sir's (IITPKD) XAI work |
Neural Response Interpretation through the Lens of Critical Pathways |
Arxiv |
PyTorch - Pathway Grad, PyTorch - ROAR |
|
| Inaugment |
InAugment: Improving Classifiers via Internal Augmentation |
Arxiv |
Code yet to be updated |
|
| Gradual Grad CAM |
Enhancing Deep Neural Network Saliency Visualizations with Gradual Extrapolation |
Arxiv |
PyTorch |
|
| A-FMI |
A-FMI: LEARNING ATTRIBUTIONS FROM DEEP NETWORKS VIA FEATURE MAP IMPORTANCE |
Arxiv |
|
|
| Trust - Regression |
To Trust or Not to Trust a Regressor: Estimating and Explaining Trustworthiness of Regression Predictions |
AAAI 2021 |
sklearn |
|
| Concept based explanations - study |
IS DISENTANGLEMENT ALL YOU NEED? COMPARING CONCEPT-BASED & DISENTANGLEMENT APPROACHES |
ICLR 2021 workshop |
tensorflow 2.3 |
|
| Faithful attribution |
Mutual Information Preserving Back-propagation: Learn to Invert for Faithful Attribution |
Arxiv |
|
|
| Counterfactual explanation |
Counterfactual attribute-based visual explanations for classification |
Springer |
|
|
| User based explanations |
“That's (not) the output I expected!” On the role of end user expectations in creating explanations of AI systems |
AIJ |
|
|
| Human understandable concept based explanations |
Towards Human-Understandable Visual Explanations: Imperceptible High-frequency Cues Can Better Be Removed |
Arxiv |
|
|
| Improved attribution |
Improving Attribution Methods by Learning Submodular Functions |
Arxiv |
|
|
| SHAP tractability |
On the Complexity of SHAP-Score-Based Explanations: Tractability via Knowledge Compilation and Non-Approximability Results |
Arxiv |
|
|
| SHAP explanation network |
SHAPLEY EXPLANATION NETWORKS |
ICLR 2021 |
PyTorch |
|
| Concept based dataset shift explanation |
FAILING CONCEPTUALLY: CONCEPT-BASED EXPLANATIONS OF DATASET SHIFT |
ICLR 2021 workshop |
tensorflow 2 |
|
| EbD |
Towards Human-Understandable Visual Explanations: Imperceptible High-frequency Cues Can Better Be Removed |
Arxiv |
|
|
| Evaluating CAM |
Revisiting The Evaluation of Class Activation Mapping for Explainability: A Novel Metric and Experimental Analysis |
Arxiv |
|
|
| EFC-CAM |
Exclusive Feature Constrained Class Activation Mapping for Better Visual Explanation |
IEEE |
|
|
| Causal Interpretation |
Instance-wise Causal Feature Selection for Model Interpretation |
Arxiv |
PyTorch |
|
| Fairness in Learning |
Learning to Learn to be Right for the Right Reasons |
Arxiv |
|
|
| Feature attribution correctness |
Do Feature Attribution Methods Correctly Attribute Features? |
Arxiv |
Code not yet updated |
|
| NICE |
NICE: AN ALGORITHM FOR NEAREST INSTANCE COUNTERFACTUAL EXPLANATIONS |
Arxiv |
Own Python Package |
|
| SCG |
A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts |
Arxiv |
|
|
| Visual Concepts |
A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts |
Arxiv |
|
|
| This looks like that - drawback |
This Looks Like That... Does it? Shortcomings of Latent Space Prototype Interpretability in Deep Networks |
Arxiv |
PyTorch |
|
| Exemplar based classification |
Visualizing Association in Exemplar-Based Classification |
ICASSP 2021 |
|
|
| Correcting classification |
CORRECTING CLASSIFICATION: A BAYESIAN FRAMEWORK USING EXPLANATION FEEDBACK TO IMPROVE CLASSIFICATION ABILITIES |
Arxiv |
|
|
| Concept Bottleneck Networks |
DO CONCEPT BOTTLENECK MODELS LEARN AS INTENDED? |
ICLR workshop 2021 |
|
|
| Sanity for saliency |
Sanity Simulations for Saliency Methods |
Arxiv |
|
|
| Concept based explanations |
Cause and Effect: Concept-based Explanation of Neural Networks |
Arxiv |
|
|
| CLIMEP |
How to Explain Neural Networks: A perspective of data space division |
Arxiv |
|
|
| Sufficient explanations |
Probabilistic Sufficient Explanations |
Arxiv |
Empty Repository |
|
| SHAP baseline |
Learning Baseline Values for Shapley Values |
Arxiv |
|
|
| Explainable by Design |
EXoN: EXplainable encoder Network |
Arxiv |
tensorflow 2.4.0 |
explainable VAE |
| Concept based explanations |
Aligning Artificial Neural Networks and Ontologies towards Explainable AI |
AAAI 2021 |
|
|
| XAI via Bayesian teaching |
ABSTRACTION, VALIDATION, AND GENERALIZATION FOR EXPLAINABLE ARTIFICIAL INTELLIGENCE |
Arxiv |
|
|
| Explanation blind spots |
DO NOT EXPLAIN WITHOUT CONTEXT: ADDRESSING THE BLIND SPOT OF MODEL EXPLANATIONS |
Arxiv |
|
|
| BLA |
Bounded logit attention: Learning to explain image classifiers |
Arxiv |
tensorflow |
L2X++ |
| Interpretability - mathematical model |
The Definitions of Interpretability and Learning of Interpretable Models |
Arxiv |
|
|
| Similar to our ICML workshop 2021 work |
The effectiveness of feature attribution methods and its correlation with automatic evaluation scores |
Arxiv |
|
|
| EDDA |
EDDA: Explanation-driven Data Augmentation to Improve Model and Explanation Alignment |
Arxiv |
|
|
| Relevant set explanations |
Efficient Explanations With Relevant Sets |
Arxiv |
|
|
| Model transfer |
Making CNNs Interpretable by Building Dynamic Sequential Decision Forests with Top-down Hierarchy Learning |
Arxiv |
|
|
| Model correction |
Finding and Fixing Spurious Patterns with Explanations |
Arxiv |
|
|
| Neuron graph communities |
On the Evolution of Neuron Communities in a Deep Learning Architecture |
Arxiv |
|
|
| Mid level features explanations |
A general approach for Explanations in terms of Middle Level Features |
Arxiv |
|
see how different from MUSE by Hima Lakkaraju group |
| Concept based knowledge distillation |
Towards Black-Box Explainability with Gaussian Discriminant Knowledge Distillation |
CVPR 2021 workshop |
|
compare and contrast with network dissection |
| CNN high frequency bias |
Dissecting the High-Frequency Bias in Convolutional Neural Networks |
CVPR 2021 workshop |
Tensorflow |
|
| Explainable by design |
Entropy-based Logic Explanations of Neural Networks |
Arxiv |
PyTorch |
concept based |
| CALM |
Keep CALM and Improve Visual Feature Attribution |
Arxiv |
PyTorch |
|
| Relevance CAM |
Relevance-CAM: Your Model Already Knows Where to Look |
CVPR 2021 |
PyTorch |
|
| S-LIME |
S-LIME: Stabilized-LIME for Model Explanation |
Arxiv |
sklearn |
|
| Local + Global |
Best of both worlds: local and global explanations with human-understandable concepts |
Arxiv |
|
Been Kim's group |
| Guided integrated gradients |
Guided Integrated Gradients: an Adaptive Path Method for Removing Noise |
CVPR 2021 |
|
|
| Concept based |
Meaningfully Explaining a Model’s Mistakes |
Arxiv |
|
|
| Explainable by design |
It’s FLAN time! Summing feature-wise latent representations for interpretability |
Arxiv |
|
|
| SimAM |
SimAM: A Simple, Parameter-Free Attention Module for Convolutional Neural Networks |
ICML 2021 |
PyTorch |
|
| DANCE |
DANCE: Enhancing saliency maps using decoys |
ICML 2021 |
Tensorflow 1.x |
|
| EbD Concept formation |
Explore Visual Concept Formation for Image Classification |
ICML 2021 |
PyTorch |
|
| Explainable by design |
Interpretable Compositional Convolutional Neural Networks |
Arxiv |
|
|
| Attribution aggregation |
Explaining Convolutional Neural Networks through Attribution-Based Input Sampling and Block-Wise Feature Aggregation |
AAAI 2021 - pdf |
|
|
| Perturbation based activation |
A Novel Visual Interpretability for Deep Neural Networks by Optimizing Activation Maps with Perturbation |
AAAI 2021 |
|
|
| Global explanations |
Feature Synergy, Redundancy, and Independence in Global Model Explanations using SHAP Vector Decomposition |
Arxiv |
Github package |
|
| L2E |
Learning to Explain: Generating Stable Explanations Fast |
ACL 2021 |
PyTorch |
NLE |
| Joint Shapley |
Joint Shapley values: a measure of joint feature importance |
Arxiv |
|
|
| Explainable by design |
Align Yourself: Self-supervised Pre-training for Fine-grained Recognition via Saliency Alignment |
Arxiv |
|
|
| Explainable by design |
SONG: SELF-ORGANIZING NEURAL GRAPHS |
Arxiv |
|
|
| Explainable by design |
Designing Shapelets for Interpretable Data-Agnostic Classification |
AIES 2021 |
sklearn |
Interpretable block of time series extended to other data modalities like image, text, tabular
| Global explanations + Model correction |
Where do Models go Wrong? Parameter-Space Saliency Maps for Explainability |
Arxiv |
PyTorch |
|
| HIL- Model correction |
Human-in-the-loop Extraction of Interpretable Concepts in Deep Learning Models |
Arxiv |
|
|
| Activation based Cause Analysis |
Activation-Based Cause Analysis Method for Neural Networks |
IEEE Access 2021 |
|
|
| Local explanations |
Leveraging Latent Features for Local Explanations |
ACM SIGKDD 2021 |
|
Amit Dhurandhar group |
| Fairness |
Adequate and fair explanations |
Arxiv - Accepted in CD-MAKE 2021 |
|
|
| Global explanations |
Finding Representative Interpretations on Convolutional Neural Networks |
ICCV 2021 |
|
|
| Groupwise explanations |
Learning Groupwise Explanations for Black-Box Models |
IJCAI 2021 |
PyTorch |
|
| Mathematical |
On Smoother Attributions using Neural Stochastic Differential Equations |
IJCAI 2021 |
|
|
| AGI |
Explaining Deep Neural Network Models with Adversarial Gradient Integration |
IJCAI 2021 |
PyTorch |
|
| Accountable attribution |
Longitudinal Distance: Towards Accountable Instance Attribution |
Arxiv |
Tensorflow Keras |
|
| Global explanation |
Understanding of Kernels in CNN Models by Suppressing Irrelevant Visual Features in Images |
Arxiv |
|
|
| Concepts based - Explainable by design |
Inducing Semantic Grouping of Latent Concepts for Explanations: An Ante-Hoc Approach |
Arxiv |
|
IITH Vineeth sir group |
| Explainable by design |
This looks more like that: Enhancing Self-Explaining Models by Prototypical Relevance Propagation |
Arxiv |
|
|
| MIL |
ProtoMIL: Multiple Instance Learning with Prototypical Parts for Fine-Grained Interpretability |
Arxiv |
|
|
| Concept based explanations |
Instance-wise or Class-wise? A Tale of Neighbor Shapley for Concept-based Explanation |
Arxiv |
|
|
| Counterfactual explanation + Theory of Mind |
CX-ToM: Counterfactual Explanations with Theory-of-Mind for Enhancing Human Trust in Image Recognition Models |
Arxiv |
|
|
| Evaluation metric |
Counterfactual Evaluation for Explainable AI |
Arxiv |
|
|
| CIM - FSC |
CIM: Class-Irrelevant Mapping for Few-Shot Classification |
Arxiv |
|
|
| Causal Concepts |
Unsupervised Causal Binary Concepts Discovery with VAE for Black-box Model Explanation |
Arxiv |
|
|
| ECE |
Ensemble of Counterfactual Explainers |
Paper |
Code - seems hybrid of tf and torch |
|
| Structured Explanations |
From Heatmaps to Structured Explanations of Image Classifiers |
Arxiv |
|
|
| XAI metric |
An Objective Metric for Explainable AI - How and Why to Estimate the Degree of Explainability |
Arxiv |
|
|
| DisCERN |
DisCERN: Discovering Counterfactual Explanations using Relevance Features from Neighbourhoods |
Arxiv |
|
|
| PSEM |
Towards Better Model Understanding with Path-Sufficient Explanations |
Arxiv |
|
Amit Dhurandhar sir group |
| Evaluation traps |
The Logic Traps in Evaluating Post-hoc Interpretations |
Arxiv |
|
|
| Interactive explanations |
Explainability Requires Interactivity |
Arxiv |
PyTorch |
|
| CounterNet |
CounterNet: End-to-End Training of Counterfactual Aware Predictions |
Arxiv |
PyTorch |
|
| Evaluation metric - Concept based explanation |
Detection Accuracy for Evaluating Compositional Explanations of Units |
Arxiv |
|
|
| Explanation - Uncertainty |
Effects of Uncertainty on the Quality of Feature Importance Explanations |
Arxiv |
|
|
| Survey Paper |
TOWARDS USER-CENTRIC EXPLANATIONS FOR EXPLAINABLE MODELS: A REVIEW |
JISTM Journal Paper |
|
|
| Feature attribution |
The Struggles and Subjectivity of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets |
AAAI 2021 workshop |
|
|
| Contextual explanation |
Context-based image explanations for deep neural networks |
Image and Vision Computing Journal |
|
|
| Causal + Counterfactual |
Counterfactual Instances Explain Little |
Arxiv |
|
|
| Case based Posthoc |
Explaining Deep Learning using examples: Optimal feature weighting methods for twin systems using post-hoc, explanation-by-example in XAI |
Elsevier |
|
|
| Debugging gray box model |
Toward a Unified Framework for Debugging Gray-box Models |
Arxiv |
|
|
| Explainable by design |
Optimising for Interpretability: Convolutional Dynamic Alignment Networks |
Arxiv |
|
|
| XAI negative effect |
Explainability Pitfalls: Beyond Dark Patterns in Explainable AI |
Arxiv |
|
|
| Evaluate attributions |
WHO EXPLAINS THE EXPLANATION? QUANTITATIVELY ASSESSING FEATURE ATTRIBUTION METHODS |
Arxiv |
|
|
| Counterfactual explanations |
Designing Counterfactual Generators using Deep Model Inversion |
Arxiv |
|
|
| Model correction using explanation |
Consistent Explanations by Contrastive Learning |
Arxiv |
|
|
| Visualize feature maps |
Visualizing Feature Maps for Model Selection in Convolutional Neural Networks |
ICCV 2021 Workshop |
Tensorflow 1.15 |
|
| SPS |
Stochastic Partial Swap: Enhanced Model Generalization and Interpretability for Fine-grained Recognition |
ICCV 2021 |
PyTorch |
|
| DMBP |
Generating Attribution Maps with Disentangled Masked Backpropagation |
ICCV 2021 |
|
|
| Better CAM |
Towards Better Explanations of Class Activation Mapping |
ICCV 2021 |
|
|
| LEG |
Statistically Consistent Saliency Estimation |
ICCV 2021 |
Keras |
|
| IBA |
Fine-Grained Neural Network Explanation by Identifying Input Features with Predictive Information |
NeurIPS 2021 |
PyTorch |
|
| Looks similar to This Looks Like That |
Interpretable Image Recognition by Constructing Transparent Embedding Space |
ICCV 2021 |
Code not yet publicly released |
|
| Causal Imagenet |
CAUSAL IMAGENET: HOW TO DISCOVER SPURIOUS FEATURES IN DEEP LEARNING? |
Arxiv |
|
|
| Model correction |
Logic Constraints to Feature Importances |
Arxiv |
|
|
| Receptive field Misalignment CAM |
On the Receptive Field Misalignment in CAM-based Visual Explanations |
Pattern Recognition Letters |
PyTorch |
|
| Simplex |
Explaining Latent Representations with a Corpus of Examples |
Arxiv |
PyTorch |
|
| Sanity checks |
Revisiting Sanity Checks for Saliency Maps |
Arxiv - NeurIPS 2021 workshop |
|
|
| Model correction |
Debugging the Internals of Convolutional Networks |
PDF |
|
|
| SITE |
Self-Interpretable Model with Transformation Equivariant Interpretation |
Arxiv |
Accepted at NeurIPS 2021 |
EbD |
| Influential examples |
Revisiting Methods for Finding Influential Examples |
Arxiv |
|
|
| SOBOL |
Look at the Variance! Efficient Black-box Explanations with Sobol-based Sensitivity Analysis |
NeurIPS 2021 |
Tensorflow and PyTorch |
|
| Feature vectors |
Beyond Importance Scores: Interpreting Tabular ML by Visualizing Feature Semantics |
Arxiv |
|
global interpretability |
| OOD in explainability |
The Out-of-Distribution Problem in Explainability and Search Methods for Feature Importance Explanations |
NeurIPS 2021 |
sklearn |
|
| RPS LJE |
Representer Point Selection via Local Jacobian Expansion for Post-hoc Classifier Explanation of Deep Neural Networks and Ensemble Models |
NeurIPS 2021 |
PyTorch |
|
| Model correction |
Editing a Classifier by Rewriting Its Prediction Rules |
NeurIPS 2021 |
Code |
|
| suppressor variable litmus test |
Scrutinizing XAI using linear ground-truth data with suppressor variables |
Arxiv |
|
|
| Explainable knowledge distillation |
Learning Interpretation with Explainable Knowledge Distillation |
Arxiv |
|
|
| STEEX |
STEEX: Steering Counterfactual Explanations with Semantics |
Arxiv |
Code |
|
| Binary counterfactual explanation |
Counterfactual Explanations via Latent Space Projection and Interpolation |
Arxiv |
|
|
| ECLAIRE |
Efficient Decompositional Rule Extraction for Deep Neural Networks |
Arxiv |
R |
|
| CartoonX |
Cartoon Explanations of Image Classifiers |
ResearchGate |
|
|
| concept based explanation |
Explanations in terms of Hierarchically organised Middle Level Features |
Paper |
|
see how close to MACE and PACE |
| Concept ball |
Ontology-based 𝑛-ball Concept Embeddings Informing Few-shot Image Classification |
Paper |
|
|
| SPARROW |
SPARROW: Semantically Coherent Prototypes for Image Classification |
BMVC 2021 |
|
|
| XAI evaluation criteria |
Objective criteria for explanations of machine learning models |
Paper |
|
|
| Code inversion with human perception |
EXPLORING ALIGNMENT OF REPRESENTATIONS WITH HUMAN PERCEPTION |
Arxiv |
|
|
| Deformable ProtoPNet |
Deformable ProtoPNet: An Interpretable Image Classifier Using Deformable Prototypes |
Arxiv |
|
|
| ICSN |
Interactive Disentanglement: Learning Concepts by Interacting with their Prototype Representations |
Arxiv |
|
|
| HIVE |
HIVE: Evaluating the Human Interpretability of Visual Explanations |
Arxiv |
Project Page |
|
| Jitter CAM |
Jitter-CAM: Improving the Spatial Resolution of CAM-Based Explanations |
BMVC 2021 |
PyTorch |
|
| Interpreting last layer |
Identifying Class Specific Filters with L1 Norm Frequency Histograms in Deep CNNs |
Arxiv |
|
|
| FCP |
Forward Composition Propagation for Explainable Neural Reasoning |
Arxiv |
|
|
| Protopool |
Interpretable Image Classification with Differentiable Prototypes Assignment |
Arxiv |
|
|
| PRELIM |
Pedagogical Rule Extraction for Learning Interpretable Models |
Arxiv |
|
|
| Fair correction vectors |
FAIR INTERPRETABLE LEARNING VIA CORRECTION VECTORS |
ICLR 2021 |
|
|
| Smooth LRP |
SmoothLRP: Smoothing LRP by Averaging over Stochastic Input Variations |
ESANN 2021 |
|
|
| Causal CAM |
EXTRACTING CAUSAL VISUAL FEATURES FOR LIMITED LABEL CLASSIFICATION |
ICIP 2021 |
|
|