[Summary] Add gradient-based attribution methods #122

Open
gsarti opened this issue Mar 3, 2022 · 2 comments
Labels: enhancement (New feature or request) · help wanted (Extra attention is needed) · summary (Summarizes multiple sub-tasks)

gsarti (Member) commented Mar 3, 2022

πŸš€ Feature Request

The following is a non-exhaustive list of gradient-based feature attribution methods that could be added to the library:

| Method name | Source | In Captum | Code implementation | Status |
|---|---|---|---|---|
| DeepLiftSHAP | - | ✅ | pytorch/captum | |
| GradientSHAP[^1] | Lundberg and Lee '17 | ✅ | pytorch/captum | |
| Guided Backprop | Springenberg et al. '15 | ✅ | pytorch/captum | |
| LRP[^2] | Bach et al. '15 | ✅ | pytorch/captum | |
| Guided Integrated Gradients | Kapishnikov et al. '21 | | PAIR-code/saliency | |
| Projected Gradient Descent (PGD)[^3] | Madry et al. '18, Yin et al. '22 | | uclanlp/NLP-Interpretation-Faithfulness | |
| Sequential Integrated Gradients | Enguehard '23 | | josephenguehard/time_interpret | |
| Greedy PIG[^4] | Axiotis et al. '23 | | | |
| AttnLRP | Achtibat et al. '24 | | rachtibat/LRP-for-Transformers | |
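
For context, a minimal usage sketch of how a newly added method would be exposed, assuming the current `inseq.load_model` interface and using `integrated_gradients` as a stand-in for any method identifier from the table:

```python
import inseq

# Load a model together with a gradient-based attribution method.
# "integrated_gradients" stands in for any newly added method identifier.
model = inseq.load_model("Helsinki-NLP/opus-mt-en-fr", "integrated_gradients")

# Attribute a generation and display token-level importance scores.
out = model.attribute("Hello everyone, hope you're enjoying the tutorial!")
out.show()
```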

Notes:

  1. The Deconvolution method could also be added, but it appears to perform the same procedure as Guided Backprop, so it was not included to avoid duplication.

Footnotes:

[^1]: The method was already present in Inseq but was removed due to instability between the single-example and batched settings; reintroducing it will require fixing this problem.
[^2]: Custom rules for the supported architectures need to be defined in order to adapt the LRP attribution method to our use case. An existing TensorFlow implementation of LRP rules for Transformer models is available here: [lena-voita/the-story-of-heads](https://github.com/lena-voita/the-story-of-heads).
[^3]: The method leverages gradient information to perform adversarial replacement, so its placement in the gradient-based family should be reviewed.
[^4]: Similar to Sequential Integrated Gradients, but instead of focusing on one word at a time, at every iteration the top features identified by the attribution are fixed (i.e. their baseline is set to the identity) and the remaining ones are attributed again in the next round (see the sketch below).
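
A rough sketch of the fix-and-reattribute loop described in footnote 4, purely illustrative: `attribute_fn` is a placeholder for an IG-style attribution call returning per-token scores, not an existing Inseq or Captum API.

```python
import torch

def greedy_pig(inputs_embeds, baseline, attribute_fn, k=2, rounds=3):
    """Illustrative Greedy PIG loop: at each round, the top-k attributed
    positions are 'fixed' (their baseline is set to the input itself, i.e.
    the identity) and the remaining positions are re-attributed."""
    fixed = torch.zeros(inputs_embeds.shape[:2], dtype=torch.bool)  # (batch, seq_len)
    scores = torch.zeros(inputs_embeds.shape[:2])
    for _ in range(rounds):
        # Baseline equals the input for fixed positions, the original baseline elsewhere.
        current_baseline = torch.where(fixed.unsqueeze(-1), inputs_embeds, baseline)
        round_scores = attribute_fn(inputs_embeds, current_baseline)  # (batch, seq_len)
        # Exclude already-fixed positions from the next top-k selection.
        round_scores = round_scores.masked_fill(fixed, float("-inf"))
        top = round_scores.topk(k, dim=-1).indices
        scores.scatter_(-1, top, round_scores.gather(-1, top))
        fixed.scatter_(-1, top, True)
    return scores
```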
@gsarti added the enhancement, good first issue, help wanted, and summary labels on Mar 3, 2022
@gsarti added this to the v1.0 milestone on Mar 3, 2022
@gsarti removed this from the Demo Paper Release milestone on Oct 26, 2022
@gsarti pinned this issue on Jan 24, 2023
@gsarti removed the good first issue label on Jul 20, 2023
saxenarohit commented

Is there a plan to add LRP to inseq?

gsarti (Member, Author) commented Dec 6, 2023

Hi @saxenarohit, in principle the Captum LRP implementation should be directly compatible with Inseq. However, the implementation is very model-specific, with some notable (and, to my knowledge, presently unsolved) issues with skip connections, which are the bread and butter of most transformer architectures used in Inseq (see pytorch/captum#546).

I think that, in general, to proceed with an integration we should make sure that:

  1. at least the majority of Inseq-supported models are compatible with the propagation rules currently supported by Captum; and
  2. using LRP with unsupported architectures raises informative errors to prevent involuntary misuse (see the sketch below for one possible shape of such a check).
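
To make point 2 concrete, one possible (purely illustrative) shape for such a guard, assuming a hypothetical `SUPPORTED_LRP_ARCHITECTURES` registry rather than any existing Inseq attribute:

```python
# Hypothetical guard: raise an informative error before running LRP on an
# architecture without defined propagation rules. The registry name and the
# check itself are assumptions, not existing Inseq code.
SUPPORTED_LRP_ARCHITECTURES = {"bert", "roberta"}  # placeholder set

def check_lrp_support(model) -> None:
    model_type = getattr(model.config, "model_type", None)
    if model_type not in SUPPORTED_LRP_ARCHITECTURES:
        raise ValueError(
            f"LRP attribution is not supported for '{model_type}': no propagation "
            "rules are defined for this architecture. Supported architectures: "
            f"{sorted(SUPPORTED_LRP_ARCHITECTURES)}."
        )
```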
