[Summary] Add gradient-based attribution methods #122

Open
gsarti opened this issue Mar 3, 2022 · 2 comments
Labels: enhancement (New feature or request) · help wanted (Extra attention is needed) · summary (Summarizes multiple sub-tasks)

gsarti (Member) commented Mar 3, 2022

πŸš€ Feature Request

The following is a non-exhaustive list of gradient-based feature attribution methods that could be added to the library:

| Method name | Source | In Captum | Code implementation | Status |
|---|---|---|---|---|
| DeepLiftSHAP | - | ✅ | pytorch/captum | |
| GradientSHAP[^1] | Lundberg and Lee '17 | ✅ | pytorch/captum | |
| Guided Backprop | Springenberg et al. '15 | ✅ | pytorch/captum | |
| LRP[^2] | Bach et al. '15 | ✅ | pytorch/captum | |
| Guided Integrated Gradients | Kapishnikov et al. '21 | | PAIR-code/saliency | |
| Projected Gradient Descent (PGD)[^3] | Madry et al. '18, Yin et al. '22 | | uclanlp/NLP-Interpretation-Faithfulness | |
| Sequential Integrated Gradients | Enguehard '23 | | josephenguehard/time_interpret | |
| Greedy PIG[^4] | Axiotis et al. '23 | | | |
| AttnLRP | Achtibat et al. '24 | | rachtibat/LRP-for-Transformers | |
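
For context, a minimal usage sketch of how a newly added method would be exposed, assuming the current `inseq.load_model` interface and using `integrated_gradients` as a stand-in for any method identifier from the table:

```python
import inseq

# Load a model together with a gradient-based attribution method.
# "integrated_gradients" stands in for any newly added method identifier.
model = inseq.load_model("Helsinki-NLP/opus-mt-en-fr", "integrated_gradients")

# Attribute a generation and display token-level importance scores.
out = model.attribute("Hello everyone, hope you're enjoying the tutorial!")
out.show()
```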

Notes:

  1. The Deconvolution method could also be added, but it appears to perform the same procedure as Guided Backprop, so it was not included to avoid duplication.

Footnotes:

[^1]: The method was already present in Inseq but was removed due to instability between the single-example and batched settings; reintroducing it will require fixing this problem.
[^2]: Custom rules for the supported architectures need to be defined in order to adapt the LRP attribution method to our use case. An existing TensorFlow implementation of LRP rules for Transformer models is available here: [lena-voita/the-story-of-heads](https://github.com/lena-voita/the-story-of-heads).
[^3]: The method leverages gradient information to perform adversarial replacement, so its placement in the gradient-based family should be reviewed.
[^4]: Similar to Sequential Integrated Gradients, but instead of focusing on one word at a time, at every iteration the top features identified by the attribution are fixed (i.e. their baseline is set to the identity) and the remaining ones are attributed again in the next round (see the sketch below).
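
A rough sketch of the fix-and-reattribute loop described in footnote 4, purely illustrative: `attribute_fn` is a placeholder for an IG-style attribution call returning per-token scores, not an existing Inseq or Captum API.

```python
import torch

def greedy_pig(inputs_embeds, baseline, attribute_fn, k=2, rounds=3):
    """Illustrative Greedy PIG loop: at each round, the top-k attributed
    positions are 'fixed' (their baseline is set to the input itself, i.e.
    the identity) and the remaining positions are re-attributed."""
    fixed = torch.zeros(inputs_embeds.shape[:2], dtype=torch.bool)  # (batch, seq_len)
    scores = torch.zeros(inputs_embeds.shape[:2])
    for _ in range(rounds):
        # Baseline equals the input for fixed positions, the original baseline elsewhere.
        current_baseline = torch.where(fixed.unsqueeze(-1), inputs_embeds, baseline)
        round_scores = attribute_fn(inputs_embeds, current_baseline)  # (batch, seq_len)
        # Exclude already-fixed positions from the next top-k selection.
        round_scores = round_scores.masked_fill(fixed, float("-inf"))
        top = round_scores.topk(k, dim=-1).indices
        scores.scatter_(-1, top, round_scores.gather(-1, top))
        fixed.scatter_(-1, top, True)
    return scores
```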
@gsarti added the enhancement, good first issue, help wanted, and summary labels on Mar 3, 2022
@gsarti added this to the v1.0 milestone on Mar 3, 2022
@gsarti removed this from the Demo Paper Release milestone on Oct 26, 2022
@gsarti pinned this issue on Jan 24, 2023
@gsarti removed the good first issue label on Jul 20, 2023
saxenarohit commented

Is there a plan to add LRP to inseq?

gsarti (Member, Author) commented Dec 6, 2023

Hi @saxenarohit, in principle the Captum LRP implementation should be directly compatible with Inseq. However, the implementation is very model-specific, with some notable (and, to my knowledge, presently unsolved) issues with skip connections, which are the bread and butter of most transformer architectures used in Inseq (see pytorch/captum#546).

I think that, in general, to proceed with an integration we should make sure that:

  1. at least the majority of Inseq-supported models are compatible with the propagation rules currently supported by Captum; and
  2. using LRP with unsupported architectures raises informative errors to prevent involuntary misuse (see the sketch below for one possible shape of such a check).
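
To make point 2 concrete, one possible (purely illustrative) shape for such a guard, assuming a hypothetical `SUPPORTED_LRP_ARCHITECTURES` registry rather than any existing Inseq attribute:

```python
# Hypothetical guard: raise an informative error before running LRP on an
# architecture without defined propagation rules. The registry name and the
# check itself are assumptions, not existing Inseq code.
SUPPORTED_LRP_ARCHITECTURES = {"bert", "roberta"}  # placeholder set

def check_lrp_support(model) -> None:
    model_type = getattr(model.config, "model_type", None)
    if model_type not in SUPPORTED_LRP_ARCHITECTURES:
        raise ValueError(
            f"LRP attribution is not supported for '{model_type}': no propagation "
            "rules are defined for this architecture. Supported architectures: "
            f"{sorted(SUPPORTED_LRP_ARCHITECTURES)}."
        )
```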
