Lasagne/Theano-based implementation of "Sample-Efficient Deep Reinforcement Learning via Episodic Backward Update", NeurIPS 2019.
This repository provides Episodic Backward Update (EBU) with a constant diffusion factor for the Atari environment.

Dependencies:
- Numpy
- Scipy
- Pillow
- Matplotlib
- Lasagne
- ALE
- Theano (0.9.0)
Our implementation is based on Shibi He's implementation of Optimality Tightening, which is in turn based on Nathan Sprague's implementation of deep_q_rl. Please refer to https://github.com/spragunr/deep_q_rl for instructions on installing the dependencies.
We ran the code with CUDA 8.0, cuDNN 5.1.5, and a TITAN Xp GPU.
Major changes from the deep_q_rl implementation:
- ale_agents.py / _do_training : generates a temporary target Q-table for the sampled episode and performs the backward update (see the sketch after this list)
- ale_data_set.py / random_episode : samples a whole episode instead of a minibatch of transitions
- ale_experiment.py / run, run_epoch, run_episode : modified to follow the Nature DQN setting, so that each episode is played for at most 4,500 steps (18,000 frames, or 5 minutes)
- launcher.py : adds a hyperparameter beta for the diffusion factor
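To make the backward update concrete, below is a minimal NumPy sketch of how the temporary target Q-table and the training targets can be computed for one sampled episode with a constant diffusion factor. The function and variable names (ebu_targets, q_tilde, next_q) are illustrative and do not necessarily match the identifiers used in ale_agents.py.

```python
import numpy as np

def ebu_targets(rewards, actions, next_q, terminal, beta=0.5, gamma=0.99):
    """Sketch of the EBU target computation for one sampled episode.

    rewards  : (T,)   rewards r_1 ... r_T of the episode
    actions  : (T,)   actions a_1 ... a_T taken in the episode
    next_q   : (T, A) target-network Q-values Q(s_{t+1}, .) for each step
    terminal : bool   whether the episode ended in a terminal state
    """
    T = len(rewards)
    q_tilde = next_q.copy()   # temporary target Q-table
    y = np.zeros(T)
    # last step: bootstrap only if the episode was cut off, not terminated
    y[T - 1] = rewards[T - 1] + (0.0 if terminal else gamma * q_tilde[T - 1].max())
    # propagate the return backward through the episode
    for k in range(T - 2, -1, -1):
        # diffuse the newly computed target into the table entry of the
        # action actually taken at the next step, weighted by beta
        a_next = actions[k + 1]
        q_tilde[k, a_next] = beta * y[k + 1] + (1.0 - beta) * q_tilde[k, a_next]
        y[k] = rewards[k] + gamma * q_tilde[k].max()
    return y
```

The network is then trained to regress Q(s_k, a_k) toward the returned targets y; with beta = 1 this reduces to propagating the sampled return straight back through the episode, while smaller beta mixes in the target network's own estimates.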
You can train an EBU agent with a constant diffusion factor of 0.5 on Breakout, using random seed 12 on gpu0, as follows.
THEANO_FLAGS='device=gpu0, allow_gc=False' python code/run_EBU.py -r 'breakout' --Seed 12 --beta 0.5
By default, it reports the test score every 62,500 steps, 40 times in total (62,500 steps x 4 frames/step x 40 = 10M frames).
You may modify the STEPS_PER_EPOCH and EPOCHS parameters in run_EBU.py to change the total number of training steps and the frequency of evaluation, for example as shown below.
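For instance, the following values (shown here only as an illustration; check run_EBU.py for the surrounding context) keep the total of 10M training frames but evaluate twice as often as the default of 62,500 steps x 40 epochs:

```python
# 31,250 steps x 4 frames/step x 80 epochs = 10M frames in total,
# with an evaluation after every 31,250 steps instead of every 62,500.
STEPS_PER_EPOCH = 31250
EPOCHS = 80
```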
If everything runs fine, you will see output like the following.