Learning Rewards from Linguistic Feedback

This repository contains (1) data, (2) model training code, and (3) model analysis code to support the paper: https://arxiv.org/abs/2009.14715

Requirements

Install the environment via Conda:

$ conda env create -f environment.yml

Run tests:

$ python -m unittest

Data Exploration

Appendix.pdf contains additional information about models and experiments, including full transcripts from informative teacher-learner pairs.

The Jupyter notebooks in the notebooks/ directory can be used to explore the datasets and re-run model evaluation.

To run them:

$ cd notebooks/
$ jupyter lab

Datasets

The human-human and human-agent datasets can be found in notebooks/data/: human_trial_data.json and agent_trial_data.json, respectively.

The easiest way to get started with them is to use the aaai_experiment_data_exploration.ipynb notebook.
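For a quick programmatic look at the files outside the notebooks, a minimal sketch like the following loads both datasets with the standard-library json module. It only prints top-level structure, since the detailed schema is documented in the exploration notebook rather than here:

import json
from pathlib import Path

DATA_DIR = Path("notebooks/data")

# Load the human-human and human-agent trial datasets.
for name in ("human_trial_data.json", "agent_trial_data.json"):
    with open(DATA_DIR / name) as f:
        trials = json.load(f)
    # Only inspect top-level structure; see
    # aaai_experiment_data_exploration.ipynb for the full schema.
    if isinstance(trials, dict):
        print(name, "top-level keys:", list(trials)[:5])
    else:
        print(name, "records:", len(trials))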

Training

Training code and scripts are in the aaai_inference_network_training.ipynb notebook. The data augmentation step caches its results in the notebooks/data/ subfolder.

Evaluation

Evaluation code and scripts are in the aaai_model_evaluation.ipynb notebook.

This notebook can be run independently of the training notebook and uses the pretrained models. Running it caches results in the notebooks/data/ subfolder.

Pre-trained Models

Pretrained models are available in the data/model_training_10fold subdirectory. There is one .pt file for each cross-validation split. These models are loaded and used automatically in the aaai_model_evaluation.ipynb notebook.
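To load these checkpoints outside the notebook, a minimal sketch along these lines should work. Note that the per-split filenames and whether each .pt file holds a state_dict or a full pickled model are assumptions here; the evaluation notebook remains the authoritative loader:

from pathlib import Path
import torch

MODEL_DIR = Path("data/model_training_10fold")

# Load every cross-validation checkpoint; map to CPU so no GPU is required.
# Whether each .pt file holds a state_dict or a pickled model depends on how
# it was saved -- torch.load returns either, so inspect the result.
checkpoints = {
    path.name: torch.load(path, map_location="cpu")
    for path in sorted(MODEL_DIR.glob("*.pt"))
}
print(f"Loaded {len(checkpoints)} cross-validation checkpoints")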