
ALTI attribution method #217

Open · wants to merge 5 commits into main
Conversation

@gsarti (Member) commented on Aug 14, 2023

Description

This PR introduces the ALTI attribution method, following the notes detailed in #200. The implementation available at mt-upc/logit-explanations was used to set up a functioning template providing access to all internals needed to perform the ALTI computation (inputs/outputs of every module, attention weights and hidden states, plus a config associated with the AttributionModel for accessing specific modules in the Transformer block). The core logic of the method lives in inseq/attr/feat/ops/alti.py.
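For readers unfamiliar with how such internals can be collected, the snippet below is a minimal, hypothetical sketch of the forward-hook capture pattern, not the template code in this PR; the names save_io and captured, the .endswith(".attn") filter, and the dummy input ids are illustrative assumptions (the real template resolves module names via the config attached to the AttributionModel).

import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")
captured = {}

def save_io(name):
    def hook(module, inputs, output):
        # Keep the first positional input and the module output for later reuse
        out = output[0] if isinstance(output, tuple) else output
        captured[name] = (inputs[0].detach(), out.detach())
    return hook

# Illustrative filter: attach hooks to the attention submodules only
handles = [
    module.register_forward_hook(save_io(name))
    for name, module in model.named_modules()
    if name.endswith(".attn")
]

with torch.no_grad():
    model(input_ids=torch.tensor([[1, 2, 3, 4]]), output_attentions=True, output_hidden_states=True)

for handle in handles:
    handle.remove()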

TODOs:

  • Adapt the core computations from get_logit_contributions to produce ALTI contributions (pre-rollout, since rollout will be handled separately post-attribution via ad-hoc aggregator classes; see the rollout sketch after this list)
  • Adapt the ALTI+ logic from transformer-contributions-nmt to the transformers library for encoder-decoder model support.
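For reference, a minimal sketch of the kind of post-attribution rollout aggregation mentioned above; the rollout function name and signature are hypothetical and are not the aggregator classes that will be added to inseq. Since ALTI contributions already account for the residual path, the identity-mixing step from attention rollout (Abnar & Zuidema, 2020) is shown only as an option.

import torch

def rollout(contributions: torch.Tensor, add_residual: bool = False) -> torch.Tensor:
    # contributions: (batch_size, n_layers, seq_len, seq_len), rows summing to 1.
    # Returns: (batch_size, seq_len, seq_len) contributions aggregated across layers.
    batch_size, n_layers, seq_len, _ = contributions.shape
    if add_residual:
        # Attention-rollout-style residual mixing; not needed when the
        # per-layer matrices already account for the residual stream.
        eye = torch.eye(seq_len).expand(batch_size, n_layers, seq_len, seq_len)
        contributions = 0.5 * contributions + 0.5 * eye
        contributions = contributions / contributions.sum(dim=-1, keepdim=True)
    joint = contributions[:, 0]
    for layer in range(1, n_layers):
        # Propagate contributions layer by layer via matrix multiplication
        joint = contributions[:, layer] @ joint
    return joint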

cc @gegallego @javiferran

Comment on lines 92 to 106
# TODO: Implement decoder-only and encoder-decoder variants
# Resulting tensors (pre-rollout):
# `decoder_contributions` is a tensor of ALTI contributions for each token in the target sequence
# with shape (batch_size, n_layers, target_seq_len, target_seq_len)
decoder_contributions = torch.zeros_like(decoder_self_attentions[:, :, 0, ...])
if self.forward_func.is_encoder_decoder:
    # `encoder_contributions` is a tensor of ALTI contributions for each token in the source sequence
    # with shape (batch_size, n_layers, source_seq_len, source_seq_len)
    encoder_contributions = torch.zeros_like(encoder_self_attentions[:, :, 0, ...])
    # `cross_contributions` is a tensor of ALTI contributions of shape
    # (batch_size, n_layers, target_seq_len, source_seq_len)
    cross_contributions = torch.zeros_like(cross_attentions[:, :, 0, ...])
else:
    encoder_contributions = None
    cross_contributions = None
gsarti (Member Author):
Currently the ALTI method is not implemented, but tensors matching the desired output shapes are returned where the scores should be produced (i.e. running the method works, but yields zeros instead of actual scores). The method can be run as:

import inseq

model = inseq.load_model("gpt2", "alti")
out = model.attribute("This is a test.")
out
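To make the intended computation more concrete, here is a minimal, hypothetical sketch of the per-layer contribution step that would replace the zero placeholders, loosely following the distance-based formulation of Ferrando et al. (2022). The function name and the assumption that the decomposed per-token transformed vectors T_i(x_j) are already available from the captured internals are illustrative, not this PR's implementation.

import torch

def alti_layer_contributions(transformed: torch.Tensor, output: torch.Tensor) -> torch.Tensor:
    # transformed: (batch_size, seq_len, seq_len, hidden_size), T_i(x_j) for every pair (i, j).
    # output: (batch_size, seq_len, hidden_size), attention block output y_i for each token i.
    # Returns: (batch_size, seq_len, seq_len) contributions with rows summing to 1.
    # L1 distance between y_i and the contribution of each input token j
    distances = torch.norm(output.unsqueeze(2) - transformed, p=1, dim=-1)
    # Tokens whose contribution lies far from the output contribute little;
    # negative values are clipped to zero before normalization.
    raw = torch.clamp(output.norm(p=1, dim=-1).unsqueeze(-1) - distances, min=0)
    return raw / raw.sum(dim=-1, keepdim=True).clamp(min=1e-9)

Stacking such per-layer matrices would yield the (batch_size, n_layers, target_seq_len, target_seq_len) tensor expected for decoder_contributions in the excerpt above.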

* origin/main:
  Attributed behavior for contrastive step functions (#228)
  Fix command for installing pre-commit hooks. (#229)
  Remove `max_input_length` from `model.encode` (#227)
  Migrate to `ruff format` (#225)
  Remove contrast_target_prefixes from contrastive step functions (#224)
  Step functions fixes, add `in_context_pvi` (#223)
  Format fixes, add Attanasio et al. (2023) to readme
  Add Sequential IG method (#222)
  Fix LIME and Occlusion outputs (#220)
  Update citation information
  Bump dependencies
  Add end_pos for contrast_targets_alignments
  Fix dummy output viz in console
  Minor fixes
* origin:
  Fix python version in rtd config
  Bump dependencies, update version and readme (#236)
  Remove jax group from main Installation section.
  Add optional jax group to enforce compatible jaxlib version.
  Minor fixes (#233)
  Update tutorial to contrastive attribution changes (#231)