
Class-Discriminative Attention Maps (CDAM)


CDAM is a novel post-hoc explanation method for vision transformers (ViTs) that is highly sensitive to the chosen target class and reveals evidence and counter-evidence through signed relevance scores.

Run notebook locally

To run `CDAM.ipynb` locally, we recommend creating a Python >= 3.9 virtual environment, for example with Mamba. Inside the environment, run `pip install -r requirements_local.txt` and create a Jupyter kernel with `python -m ipykernel install --user --name cdam_kernel`.
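The same setup as a shell session (the environment name `cdam` is an arbitrary choice):

```bash
# create and activate a Python >= 3.9 environment, e.g. with Mamba
mamba create -n cdam python=3.9
mamba activate cdam

# install dependencies and register a Jupyter kernel for the notebook
pip install -r requirements_local.txt
python -m ipykernel install --user --name cdam_kernel
```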

Run notebook on Colab

Open In Colab

Model training

We provide a checkpoint for a classifier head trained on ImageNet. With `train_imagenet.py` you can also train the classifier head yourself, but you have to download the ImageNet dataset first.
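A minimal sketch of what such a setup typically looks like, assuming a linear head on a frozen ViT backbone from timm; the model name, output dimension, and checkpoint path are illustrative assumptions, not the repo's exact code:

```python
import torch
import torch.nn as nn
import timm

# frozen ViT backbone as a feature extractor (num_classes=0 returns pooled features)
backbone = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=0).eval()
for p in backbone.parameters():
    p.requires_grad = False  # only the classifier head is trained

# linear classifier head over the 1000 ImageNet classes
head = nn.Linear(backbone.num_features, 1000)
# head.load_state_dict(torch.load("head_checkpoint.pth"))  # hypothetical checkpoint path

def f(x):
    return head(backbone(x))  # class scores f_c that CDAM explains
```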

Try Demo

Our live demo is the fastest way to try out CDAM!

Introduction

We introduce CDAM, a novel method for visualizing input feature relevance of ViT classifications. CDAM scales the attention by how relevant the corresponding tokens are for the model's decision. Beyond targeting classifier outputs, we create explanations for a similarity measure in the latent space of the ViT. This allows for explanations of arbitrary concepts, defined by the user through a few sample images.
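A hedged sketch of such a concept-similarity target (the repo's exact similarity measure may differ), assuming cosine similarity between an image's latent embedding and the mean embedding of a few user-provided concept images:

```python
import torch.nn.functional as F
import timm

# ViT feature extractor producing latent embeddings (model choice is an assumption)
backbone = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=0).eval()

def concept_similarity(image, concept_images):
    # image: (1, 3, H, W); concept_images: (K, 3, H, W) sample images of the concept
    prototype = backbone(concept_images).mean(dim=0, keepdim=True)  # (1, D) concept prototype
    return F.cosine_similarity(backbone(image), prototype)          # signed similarity score
```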

CDAMs are obtained by calculating the gradients of the class or similarity score with respect to the tokens entering the final transformer block. The CDAM score $S_{i,c}$ is defined by

$$ S_{i,c} = \sum_j T_{ij}\frac{\partial f_c}{\partial T_{ij}}, $$

where $T_{ij}$ is the $j$-th component of the $i$-th token that is fed to the last attention layer.
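As a concrete illustration, here is a minimal PyTorch sketch of this computation using a forward hook on a timm ViT; the model name, class index, and random input are placeholders, and the repo's implementation may differ in details:

```python
import torch
import timm

model = timm.create_model("vit_base_patch16_224", pretrained=True).eval()
captured = {}

def capture_tokens(module, inputs, output):
    # inputs[0] holds the tokens T entering the final transformer block, shape (B, N, D)
    inputs[0].retain_grad()
    captured["T"] = inputs[0]

hook = model.blocks[-1].register_forward_hook(capture_tokens)

image = torch.randn(1, 3, 224, 224)  # placeholder input batch
target_class = 207                   # hypothetical ImageNet class index

logits = model(image)
logits[0, target_class].backward()   # gradients d f_c / d T via autograd

T = captured["T"]
scores = (T * T.grad).sum(dim=-1)[0]   # S_i = sum_j T_ij * d f_c / d T_ij, one score per token
cdam_map = scores[1:].reshape(14, 14)  # drop the CLS token; 14x14 grid of patch scores
hook.remove()
```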
