
Scheduled Policy Optimization for Natural Language Communication with Intelligent Agents

Models and Algorithms

See files under walk_the_blocks/BlockWorldRoboticAgent/srcs/

  • learn_by_ppo.py: run this file for training. The scheduling mechanism can be changed in the function ppo_update(); the options are (see the sketch after this list):

    • do imitation every 50
    • do imitation based on rules
    • alternate one epoch of imitation with one epoch of RL

    example: python learn_by_ppo.py -lr 0.0001 -max_epochs 2 -entropy_coef 0.05

  • policy_model.py: the network architecture and the loss functions:

    • PPO Loss
    • Supervised Loss
    • Advantage Actor-Critic Loss
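
The scheduling idea in ppo_update() and the losses it switches between can be summarized with a minimal sketch. This is an illustration only: it assumes a PyTorch policy, and every identifier below (choose_update, ppo_loss, supervised_loss, a2c_loss, and their parameters) is hypothetical rather than the repository's actual code; the rule-based option is sketched here as a reward threshold, which may differ from the rule used in the repository.

import torch
import torch.nn.functional as F

def ppo_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    # Clipped PPO surrogate: penalize policy ratios that move too far from 1.
    ratio = torch.exp(new_logp - old_logp)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()

def supervised_loss(action_logits, expert_actions):
    # Imitation (supervised) loss: cross-entropy against the expert's actions.
    return F.cross_entropy(action_logits, expert_actions)

def a2c_loss(logp, advantages, values, returns, value_coef=0.5):
    # Advantage actor-critic: policy-gradient term plus a value-regression term.
    policy_term = -(logp * advantages.detach()).mean()
    value_term = F.mse_loss(values, returns)
    return policy_term + value_coef * value_term

def choose_update(step, recent_reward, schedule="every_k", k=50, reward_threshold=0.0):
    # Decide whether this update is an imitation step or an RL step,
    # following the three schedule options listed above.
    if schedule == "every_k":      # do imitation every k steps, RL otherwise
        return "imitation" if step % k == 0 else "rl"
    if schedule == "rule_based":   # do imitation when recent reward is poor
        return "imitation" if recent_reward < reward_threshold else "rl"
    if schedule == "alternate":    # one imitation epoch, then one RL epoch
        return "imitation" if step % 2 == 0 else "rl"
    raise ValueError("unknown schedule: %s" % schedule)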

Instructions

For instructions on using the Block-world environment, refer to https://github.com/clic-lab/blocks

Train the RL agents

  • S-REIN

If you use our code in your own research, please cite the following paper:

@article{xiong2018scheduled,
  title={Scheduled Policy Optimization for Natural Language Communication with Intelligent Agents},
  author={Xiong, Wenhan and Guo, Xiaoxiao and Yu, Mo and Chang, Shiyu and Zhou, Bowen and Wang, William Yang},
  journal={arXiv preprint arXiv:1806.06187},
  year={2018}
}

About

Implementation of Scheduled Policy Optimization for task-oriented language grounding
