How to get word importance #510

Open
nochimake opened this issue Feb 18, 2024 · 1 comment

@nochimake
I have a text sentiment polarity prediction model, roughly structured as RoBERTa + CNN. Now, I want to use InterpretML to explain its prediction results. My code is as follows:

```python
from interpret.glassbox import ExplainableBoostingClassifier
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

def analysis_interpret(target_name: str, text_list: list, sentiment_list: list):
    ebm = ExplainableBoostingClassifier()

    # DataGenerator is my model's text-processing class; it uses RoBERTa's
    # tokenizer to map each text to the token IDs the model expects.
    data_generator = DataGenerator(text_list, sentiment_list)

    # Flatten each sample's token-ID array, then pad to a uniform length
    # so the samples form a regular 2-D matrix for the EBM.
    X_train = [np.ravel(arr) for arr in data_generator.input_ids]
    X_train = np.array(pad_sequences(X_train))
    y_train = sentiment_list

    ebm.fit(X_train, y_train)
    ebm_local = ebm.explain_local(X_train, y_train)
```

Here, DataGenerator is my model's text-processing class; for now I am using RoBERTa's tokenizer to map the text to the token IDs the model requires, and y_train holds the labels predicted by my model. After the call ebm_local = ebm.explain_local(X_train, y_train), how can I obtain the importance of each word? I have seen people use the ebm_local.get_local_importance_dict() method, but I can't find it in version 0.5.1.
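For reference, a minimal sketch of how one might read per-sample contributions out of the explanation object in interpret 0.5.x, assuming the snippet above has run. The data(i) accessor returns a dict whose 'names' and 'scores' entries hold the feature names and their additive contributions; note that each feature here is a token position (a column of X_train), not a word:

```python
# Minimal sketch, assuming `ebm_local` from the snippet above (interpret 0.5.x).
sample_index = 0  # which training sample to inspect

# data(i) returns a dict describing the local explanation for sample i,
# including 'names' (feature names) and 'scores' (additive contributions).
local_data = ebm_local.data(sample_index)
importance = dict(zip(local_data["names"], local_data["scores"]))

# List token positions sorted by absolute contribution to this prediction.
for name, score in sorted(importance.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{name}: {score:+.4f}")
```

To turn these positional scores into word importances, each column's token ID would have to be mapped back through RoBERTa's tokenizer (e.g. tokenizer.convert_ids_to_tokens) and the scores aggregated per word. Recent interpret releases also provide ebm.eval_terms(X_train), which returns the per-term contribution matrix directly, if that method is available in your version.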

@paulbkoch (Collaborator)