Soft Actor-Critic in PyTorch

A PyTorch implementation of Soft Actor-Critic[1,2] with n-step rewards and prioritized experience replay[3].

NOTE

I re-implemented Soft Actor-Critic in the discor.pytorch repository, which is better organized and faster, alongside the DisCor algorithm. Please check it out!

Requirements

You can install the libraries using pip install -r requirements.txt, except for mujoco_py.

Note that you need a licence to install mujoco_py. For installation, please follow instructions here.

Examples

You can train a Soft Actor-Critic agent as in the following example.

python code/main.py \
[--env_id str(default HalfCheetah-v2)] \
[--cuda (optional)] \
[--seed int(default 0)]
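
For readers unfamiliar with this flag style, here is a minimal sketch of how the three options above could be parsed with argparse. It only mirrors the flags and defaults listed in the command; the function name build_parser is hypothetical, and the actual code/main.py may be organized differently.

```python
import argparse


def build_parser():
    """Build a parser mirroring the flags shown in the example command.

    This is an illustrative sketch, not the repository's actual main.py.
    """
    parser = argparse.ArgumentParser(description="Train a Soft Actor-Critic agent.")
    # Gym environment ID; the README lists HalfCheetah-v2 as the default.
    parser.add_argument("--env_id", type=str, default="HalfCheetah-v2")
    # Boolean switch: present means "use the GPU", absent means CPU.
    parser.add_argument("--cuda", action="store_true")
    # Random seed; the README lists 0 as the default.
    parser.add_argument("--seed", type=int, default=0)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.env_id, args.cuda, args.seed)
```

With this sketch, omitting every flag yields the defaults from the command above, and passing --cuda alone switches the boolean on.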

If you want to use n-step rewards and prioritized experience replay, set multi_step=5 and per=True in configs.
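
To make the two options concrete, the sketch below shows the quantities they refer to: the discounted n-step reward sum, and the proportional sampling probabilities and importance-sampling weights from prioritized experience replay[3]. The function names and the default alpha and beta values are illustrative assumptions, not taken from this repository's code, which may compute these differently.

```python
import numpy as np


def n_step_return(rewards, gamma):
    """Discounted sum of n consecutive rewards: sum_k gamma^k * r_{t+k}.

    With multi_step=5, `rewards` would hold 5 entries. A full n-step target
    would also add gamma^n times a bootstrapped value of the state reached
    after n steps; that term is omitted here.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    discounts = gamma ** np.arange(len(rewards))
    return float(np.dot(discounts, rewards))


def per_probabilities(priorities, alpha=0.6):
    """Proportional prioritization: P(i) = p_i^alpha / sum_j p_j^alpha."""
    scaled = np.asarray(priorities, dtype=np.float64) ** alpha
    return scaled / scaled.sum()


def importance_weights(probs, beta=0.4):
    """Importance-sampling weights (N * P(i))^(-beta), scaled so the max is 1."""
    probs = np.asarray(probs, dtype=np.float64)
    weights = (len(probs) * probs) ** (-beta)
    return weights / weights.max()


if __name__ == "__main__":
    print(n_step_return([1.0, 1.0, 1.0, 1.0, 1.0], gamma=0.99))
    probs = per_probabilities([1.0, 1.0, 2.0, 4.0])
    print(probs, importance_weights(probs))
```

Transitions with larger priorities (typically larger TD errors) are sampled more often, and the importance weights shrink their contribution to the loss to compensate for that bias.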

Results

The results of the example above (without n-step rewards or prioritized experience replay) are comparable to, or better than, those reported in the paper.

References

[1] Haarnoja, Tuomas, et al. "Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor." arXiv preprint arXiv:1801.01290 (2018).

[2] Haarnoja, Tuomas, et al. "Soft actor-critic algorithms and applications." arXiv preprint arXiv:1812.05905 (2018).

[3] Schaul, Tom, et al. "Prioritized experience replay." arXiv preprint arXiv:1511.05952 (2015).
