Automated explanations

Explaining black box text modules in natural language with language models (arXiv 2023)

This repo contains code to reproduce the experiments in the SASC paper. SASC takes in a text module and produces a natural language explanation for it that describes what types of inputs elicit the largest response from the module (see Fig below).

SASC is similar to the nice concurrent paper by OpenAI, but it simplifies explanations to describe the module's function rather than produce token-level activations. This makes SASC simpler and faster, and more effective at describing semantic functions from limited data (e.g. fMRI voxels), but worse at finding patterns that depend on sequences or ordering.

For a simple scikit-learn-style interface to SASC, use the imodelsX library. Install it with pip install imodelsx; the quickstart example below shows basic usage.

import numpy as np
from imodelsx import explain_module_sasc

# a toy module that responds to the length of a string
mod = lambda str_list: np.array([len(s) for s in str_list])

# a toy dataset where the longest strings are animals
text_str_list = ["red", "blue", "x", "1", "2", "hippopotamus", "elephant", "rhinoceros"]
explanation_dict = explain_module_sasc(
    text_str_list,
    mod,
    ngrams=1,
)
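
To build intuition for what "inputs that elicit the largest response" means on the toy data above, the following is a minimal sketch that ranks n-grams by module response using only NumPy. It is not the SASC or imodelsX implementation: the helper top_ngrams, its parameter k, and the whitespace tokenization are assumptions made for illustration, and the sketch omits the language-model steps that turn top-scoring n-grams into a natural language explanation.

```python
import numpy as np


def top_ngrams(text_str_list, mod, ngrams=1, k=3):
    """Hypothetical helper: return the k n-grams with the largest module response."""
    # Collect unique n-grams in order of first appearance.
    # Whitespace tokenization is an assumption made for this sketch.
    candidates, seen = [], set()
    for text in text_str_list:
        words = text.split()
        for i in range(len(words) - ngrams + 1):
            ngram = " ".join(words[i:i + ngrams])
            if ngram not in seen:
                seen.add(ngram)
                candidates.append(ngram)

    # Score every candidate with the module, then sort by descending response.
    responses = np.asarray(mod(candidates), dtype=float)
    order = np.argsort(-responses, kind="stable")[:k]
    return [(candidates[i], float(responses[i])) for i in order]


# Same toy module and dataset as in the quickstart
mod = lambda str_list: np.array([len(s) for s in str_list])
text_str_list = ["red", "blue", "x", "1", "2", "hippopotamus", "elephant", "rhinoceros"]

top = top_ngrams(text_str_list, mod, ngrams=1, k=3)
for ngram, response in top:
    print(f"{ngram}: {response}")
```

On this dataset the three highest-scoring unigrams are the animal names, which is the pattern an explanation of the toy module would be expected to pick up on.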

Reference

See related fMRI experiments
Built from this template

@misc{singh2023explaining,
      title={Explaining black box text modules in natural language with language models}, 
      author={Chandan Singh and Aliyah R. Hsu and Richard Antonello and Shailee Jain and Alexander G. Huth and Bin Yu and Jianfeng Gao},
      year={2023},
      eprint={2305.09863},
      archivePrefix={arXiv},
      primaryClass={cs.AI}
}

