
ALTI attribution method #217

Open · wants to merge 5 commits into main
Conversation

@gsarti (Member) commented on Aug 14, 2023

Description

This PR introduces the ALTI attribution method, following the notes detailed in #200. The implementation available at mt-upc/logit-explanations was used to set up a functioning template providing access to all internals needed to perform the ALTI computation (inputs/outputs of every module, attention weights and hidden states, plus a config associated with the AttributionModel for accessing specific modules in the Transformer block). The core logic of the method lives in inseq/attr/feat/ops/alti.py.
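For readers unfamiliar with how such internals can be collected, the snippet below is a minimal, hypothetical sketch of the forward-hook capture pattern, not the template code in this PR; the names save_io and captured, the .endswith(".attn") filter, and the dummy input ids are illustrative assumptions (the real template resolves module names via the config attached to the AttributionModel).

import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")
captured = {}

def save_io(name):
    def hook(module, inputs, output):
        # Keep the first positional input and the module output for later reuse
        out = output[0] if isinstance(output, tuple) else output
        captured[name] = (inputs[0].detach(), out.detach())
    return hook

# Illustrative filter: attach hooks to the attention submodules only
handles = [
    module.register_forward_hook(save_io(name))
    for name, module in model.named_modules()
    if name.endswith(".attn")
]

with torch.no_grad():
    model(input_ids=torch.tensor([[1, 2, 3, 4]]), output_attentions=True, output_hidden_states=True)

for handle in handles:
    handle.remove()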

TODOs:

  • Adapt the core computations from get_logit_contributions to produce ALTI contributions (pre-rollout, since rollout will be handled separately post-attribution via ad-hoc aggregator classes; see the rollout sketch after this list)
  • Adapt the ALTI+ logic from transformer-contributions-nmt to the transformers library for encoder-decoder model support.
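For reference, a minimal sketch of the kind of post-attribution rollout aggregation mentioned above; the rollout function name and signature are hypothetical and are not the aggregator classes that will be added to inseq. Since ALTI contributions already account for the residual path, the identity-mixing step from attention rollout (Abnar & Zuidema, 2020) is shown only as an option.

import torch

def rollout(contributions: torch.Tensor, add_residual: bool = False) -> torch.Tensor:
    # contributions: (batch_size, n_layers, seq_len, seq_len), rows summing to 1.
    # Returns: (batch_size, seq_len, seq_len) contributions aggregated across layers.
    batch_size, n_layers, seq_len, _ = contributions.shape
    if add_residual:
        # Attention-rollout-style residual mixing; not needed when the
        # per-layer matrices already account for the residual stream.
        eye = torch.eye(seq_len).expand(batch_size, n_layers, seq_len, seq_len)
        contributions = 0.5 * contributions + 0.5 * eye
        contributions = contributions / contributions.sum(dim=-1, keepdim=True)
    joint = contributions[:, 0]
    for layer in range(1, n_layers):
        # Propagate contributions layer by layer via matrix multiplication
        joint = contributions[:, layer] @ joint
    return joint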

cc @gegallego @javiferran

Comment on lines 92 to 106
# TODO: Implement decoder-only and encoder-decoder variants
# Resulting tensors (pre-rollout):
# `decoder_contributions` is a tensor of ALTI contributions for each token in the target sequence
# with shape (batch_size, n_layers, target_seq_len, target_seq_len)
decoder_contributions = torch.zeros_like(decoder_self_attentions[:, :, 0, ...])
if self.forward_func.is_encoder_decoder:
    # `encoder_contributions` is a tensor of ALTI contributions for each token in the source sequence
    # with shape (batch_size, n_layers, source_seq_len, source_seq_len)
    encoder_contributions = torch.zeros_like(encoder_self_attentions[:, :, 0, ...])
    # `cross_contributions` is a tensor of ALTI contributions of shape
    # (batch_size, n_layers, target_seq_len, source_seq_len)
    cross_contributions = torch.zeros_like(cross_attentions[:, :, 0, ...])
else:
    encoder_contributions = None
    cross_contributions = None
gsarti (Member Author):
Currently the ALTI method is not implemented, but tensors matching the desired output shapes are returned where the scores should be produced (i.e. running the method works, but yields zeros instead of actual scores). The method can be run as:

import inseq

model = inseq.load_model("gpt2", "alti")
out = model.attribute("This is a test.")
out
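To make the intended computation more concrete, here is a minimal, hypothetical sketch of the per-layer contribution step that would replace the zero placeholders, loosely following the distance-based formulation of Ferrando et al. (2022). The function name and the assumption that the decomposed per-token transformed vectors T_i(x_j) are already available from the captured internals are illustrative, not this PR's implementation.

import torch

def alti_layer_contributions(transformed: torch.Tensor, output: torch.Tensor) -> torch.Tensor:
    # transformed: (batch_size, seq_len, seq_len, hidden_size), T_i(x_j) for every pair (i, j).
    # output: (batch_size, seq_len, hidden_size), attention block output y_i for each token i.
    # Returns: (batch_size, seq_len, seq_len) contributions with rows summing to 1.
    # L1 distance between y_i and the contribution of each input token j
    distances = torch.norm(output.unsqueeze(2) - transformed, p=1, dim=-1)
    # Tokens whose contribution lies far from the output contribute little;
    # negative values are clipped to zero before normalization.
    raw = torch.clamp(output.norm(p=1, dim=-1).unsqueeze(-1) - distances, min=0)
    return raw / raw.sum(dim=-1, keepdim=True).clamp(min=1e-9)

Stacking such per-layer matrices would yield the (batch_size, n_layers, target_seq_len, target_seq_len) tensor expected for decoder_contributions in the excerpt above.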

* origin/main:
  Attributed behavior for contrastive step functions (#228)
  Fix command for installing pre-commit hooks. (#229)
  Remove `max_input_length` from `model.encode` (#227)
  Migrate to `ruff format` (#225)
  Remove contrast_target_prefixes from contrastive step functions (#224)
  Step functions fixes, add `in_context_pvi` (#223)
  Format fixes, add Attanasio et al. (2023) to readme
  Add Sequential IG method (#222)
  Fix LIME and Occlusion outputs (#220)
  Update citation information
  Bump dependencies
  Add end_pos for contrast_targets_alignments
  Fix dummy output viz in console
  Minor fixes
* origin:
  Fix python version in rtd config
  Bump dependencies, update version and readme (#236)
  Remove jax group from main Installation section.
  Add optional jax group to enforce compatible jaxlib version.
  Minor fixes (#233)
  Update tutorial to contrastive attribution changes (#231)