Is it normal that attribution takes multiple seconds per text, even on a GPU? #124

Open
MoritzLaurer opened this issue Jan 31, 2023 · 1 comment


@MoritzLaurer

Really like your package, thanks a lot for the clean implementation!

I'm trying to get attributions for each text in a large corpus (10k+ texts) on a Google Colab GPU. On a Colab T4, the speed I'm used to is several dozen texts per second during inference (batch size 16-32) and a few batches per second during training (e.g. batch size 32). For example, when I train a deberta-xsmall model I get 'train_steps_per_second': 6.121 with a batch size of 32 per step.

I don't have much experience with attribution methods, but I'm surprised that attribution seems extremely slow, even on a GPU. Based on #60, I have verified via cls_explainer.device that the explainer runs on the GPU correctly.
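
For reference, a minimal sketch of how I set up the device check, assuming model and tokenizer are already loaded (the .to(device) move is standard PyTorch/transformers usage, nothing specific to this package):

import torch
from transformers_interpret import SequenceClassificationExplainer

# move the underlying model to the GPU before building the explainer
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

cls_explainer = SequenceClassificationExplainer(model, tokenizer)
print(cls_explainer.device)  # expected to print something like cuda:0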

Despite being on a GPU, the main loop below takes around 2.6 seconds per iteration (one iteration is a single text truncated to 120 max tokens). This is with deberta-xsmall, a relatively small model.

My question: is it expected that a T4 GPU takes 2.6 seconds per text?
If not, do you see anything in the code below that I'm doing wrong? (I imagine I can increase speed by raising internal_batch_size, but I also ran into surprisingly many CUDA out-of-memory errors; see the fallback sketch after the code.)

from tqdm.notebook import tqdm  # import the notebook submodule explicitly; plain `import tqdm` does not expose it
from transformers_interpret import SequenceClassificationExplainer

cls_explainer = SequenceClassificationExplainer(model, tokenizer)
print(cls_explainer.device)

word_attributions_lst = []
for _, row in tqdm(df_test.iterrows(), total=len(df_test)):
    # calculate word attributions per text (default n_steps is 50)
    word_attributions = cls_explainer(row["text_prepared"], internal_batch_size=1, n_steps=30)
    # append the true and predicted label to each (word, score) tuple
    word_attributions_w_labels = [
        attribution_tuple + (row["label_text"], cls_explainer.predicted_class_name)
        for attribution_tuple in word_attributions
    ]
    word_attributions_lst.append(word_attributions_w_labels)
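
For the memory errors, the kind of fallback I've been experimenting with looks like the sketch below: try a larger internal_batch_size first and drop back to 1 when CUDA reports out of memory. explain_with_fallback is a hypothetical helper, and the internal batch size of 16 is just an illustrative guess, not a value from the docs:

import torch

def explain_with_fallback(text, explainer, n_steps=30):
    # hypothetical helper: larger internal batch first, retry with 1 on CUDA OOM
    try:
        return explainer(text, internal_batch_size=16, n_steps=n_steps)
    except RuntimeError as e:
        if "out of memory" not in str(e):
            raise
        torch.cuda.empty_cache()  # release cached blocks from the failed attempt
        return explainer(text, internal_batch_size=1, n_steps=n_steps)

# usage inside the loop above:
# word_attributions = explain_with_fallback(row["text_prepared"], cls_explainer)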
@MoritzLaurer
Author

Small update: I saw in the Captum docs that they often call model.eval() and model.zero_grad() before attribution. I tried that, but it didn't really help either.
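
Concretely, this is roughly what I added before the loop, following the Captum examples (it did not change the timings):

model.eval()       # disable dropout etc. so the model behaves as in inference
model.zero_grad()  # clear any stale gradients before computing attributions
# ...then run the explainer loop exactly as above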
