tsujuifu/code_ssi

[NAACL'21 (Short)] Semi-Supervised Policy Initialization for Playing Games with Language Hints

An implementation of SSI

Paper | Slide | Video

Overview

This repository implements SSI, the method proposed in
"Semi-Supervised Policy Initialization for Playing Games with Language Hints"
by Tsu-Jui Fu and William Yang Wang,
published at the North American Chapter of the Association for Computational Linguistics (NAACL) 2021 (Short).

First, the hint module H generates possible hints l for random states s. Starting from s, the policy module P rolls out and takes actions a. Then, the reward module R updates P based on the relevance between a and l. By seeing many different s, P has the opportunity to learn from a variety of possible hints, and it finally serves as a better-initialized policy.
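The loop described above can be illustrated with a toy sketch. This is not the code in this repository: the tabular policy, the `hint_module` and `reward_module` stand-ins, the REINFORCE-style update, and all constants are assumptions chosen only to show how H, P, and R interact. It uses only the Python standard library.

```python
import math
import random

NUM_STATES = 4   # hypothetical toy state space
NUM_ACTIONS = 3  # hypothetical toy action space


def hint_module(state):
    """Stand-in for H: here a 'hint' is simply the index of the action it describes."""
    return state % NUM_ACTIONS


def reward_module(action, hint):
    """Stand-in for R: relevance between the action taken and the hint (1 if they match)."""
    return 1.0 if action == hint else 0.0


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw an index according to the given probabilities."""
    u = rng.random()
    acc = 0.0
    for k, p in enumerate(probs):
        acc += p
        if u < acc:
            return k
    return len(probs) - 1


def ssi_pretrain(steps=3000, lr=0.5, seed=0):
    """Initialize a tabular softmax policy P from hint-based rewards only."""
    rng = random.Random(seed)
    logits = [[0.0] * NUM_ACTIONS for _ in range(NUM_STATES)]
    for _ in range(steps):
        s = rng.randrange(NUM_STATES)   # random state s
        l = hint_module(s)              # H proposes a hint l for s
        probs = softmax(logits[s])
        a = sample(probs, rng)          # P takes an action a
        r = reward_module(a, l)         # R scores the relevance of a to l
        # REINFORCE-style step: d log pi(a|s) / d logit_k = 1[k == a] - pi(k|s)
        for k in range(NUM_ACTIONS):
            grad = (1.0 if k == a else 0.0) - probs[k]
            logits[s][k] += lr * r * grad
    return logits


def greedy_action(logits, state):
    row = logits[state]
    return row.index(max(row))
```

After `ssi_pretrain()`, the greedy action in each toy state is the one its hint describes, which is the sense in which the policy is "better initialized" before any task reward is seen.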

Requirement

This code is implemented in Python 2 with PyTorch and TensorFlow.
The following libraries are also required:

Usage

  • Semi-Supervised Initialization (SSI)
python rl/ssi.py --lang_coeff=1.0 --lang_enc=onehot --model_dir=./learn_model
  • Task Training
wget http://www.cs.utexas.edu/~pgoyal/ijcai19/train_lang_data.pkl -O ./data/train_lang_data.pkl
wget http://www.cs.utexas.edu/~pgoyal/ijcai19/test_lang_data.pkl -O ./data/test_lang_data.pkl
python rl/main.py --expt_id=ID_EXPT --descr_id=ID_DESCR --lang_coeff=1.0 --lang_enc=onehot --model_dir=./learn_model
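To run both stages from a script, the commands above can be assembled programmatically. The sketch below only rebuilds the argument lists shown in this section. The helper names `ssi_command` and `task_command` are hypothetical and not part of the repository, and the `python_bin` default assumes that `python` resolves to the Python 2 interpreter required above.

```python
def ssi_command(model_dir="./learn_model", lang_coeff=1.0,
                lang_enc="onehot", python_bin="python"):
    """Argument list for the semi-supervised initialization stage."""
    return [
        python_bin, "rl/ssi.py",
        "--lang_coeff={}".format(lang_coeff),
        "--lang_enc={}".format(lang_enc),
        "--model_dir={}".format(model_dir),
    ]


def task_command(expt_id, descr_id, model_dir="./learn_model",
                 lang_coeff=1.0, lang_enc="onehot", python_bin="python"):
    """Argument list for the task training stage."""
    return [
        python_bin, "rl/main.py",
        "--expt_id={}".format(expt_id),
        "--descr_id={}".format(descr_id),
        "--lang_coeff={}".format(lang_coeff),
        "--lang_enc={}".format(lang_enc),
        "--model_dir={}".format(model_dir),
    ]
```

Each list can be passed to `subprocess.call` from the repository root. Both stages use the same `--model_dir`, as in the commands above.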

Citation

@inproceedings{fu2021ssi, 
  author = {Tsu-Jui Fu and William Yang Wang}, 
  title = {{Semi-Supervised Policy Initialization for Playing Games with Language Hints}}, 
  booktitle = {North American Chapter of the Association for Computational Linguistics (NAACL)}, 
  year = {2021} 
}

Acknowledgement