
Reinforcement_learning_cpp_parallel

Name is to be determined.

A parallel reinforcement learning framework written in C++

Goal: build a scalable, reliable and well-maintained RL framework in C++ that can be used for engineering research.

See the documentation here.

For dependencies, building, and running, see here.

alpha 0.1

Version alpha 0.1 implements only the Proximal Policy Optimization (PPO) learning algorithm. Basic MPI functions are used for communication between the simulation nodes and the learning node, and libtorch is used for neural network training and inference. Our code can be compiled and run on USC CARC.
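A minimal sketch of this layout, assuming rank 0 acts as the learning node and the remaining ranks act as simulation nodes; the network, the data exchanged, and the loss below are placeholders and do not correspond to the actual source files:

// Illustrative only: rank 0 is the learning node, other ranks are simulation nodes.
// Run with at least two MPI processes.
#include <mpi.h>
#include <torch/torch.h>
#include <vector>

// A tiny two-layer policy network built with libtorch (placeholder architecture).
struct PolicyNet : torch::nn::Module {
  torch::nn::Linear fc1{nullptr}, fc2{nullptr};
  PolicyNet(int obs_dim, int act_dim) {
    fc1 = register_module("fc1", torch::nn::Linear(obs_dim, 64));
    fc2 = register_module("fc2", torch::nn::Linear(64, act_dim));
  }
  torch::Tensor forward(torch::Tensor x) {
    return torch::softmax(fc2(torch::relu(fc1(x))), /*dim=*/-1);
  }
};

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  int rank = 0, size = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  const int obs_dim = 4;  // cartpole-like observation size (illustrative)
  if (rank == 0) {
    // Learning node: gather one observation from every simulation node,
    // then run a dummy update standing in for the PPO objective.
    PolicyNet policy(obs_dim, 2);
    torch::optim::Adam opt(policy.parameters(), torch::optim::AdamOptions(1e-3));
    std::vector<float> batch((size - 1) * obs_dim);
    for (int src = 1; src < size; ++src) {
      MPI_Recv(batch.data() + (src - 1) * obs_dim, obs_dim, MPI_FLOAT,
               src, /*tag=*/0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
    auto obs = torch::from_blob(batch.data(), {size - 1, obs_dim}).clone();
    auto loss = -policy.forward(obs).log().mean();  // placeholder loss, not the PPO loss
    opt.zero_grad();
    loss.backward();
    opt.step();
  } else {
    // Simulation node: step a dummy environment and send its observation to rank 0.
    std::vector<float> obs(obs_dim, 0.1f * rank);
    MPI_Send(obs.data(), obs_dim, MPI_FLOAT, 0, /*tag=*/0, MPI_COMM_WORLD);
  }
  MPI_Finalize();
  return 0;
}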

Our future goals include using pybind11 to make the code callable from Python, using the GPU for training and inference, and implementing other algorithms.
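As a rough illustration of the pybind11 goal, a binding could look like the sketch below; the module name cpp_rl and the train() entry point are invented for illustration and are not part of the current code:

// Hypothetical pybind11 wrapper exposing a training entry point to Python.
#include <pybind11/pybind11.h>
#include <string>

// Placeholder for a C++ entry point that would run PPO training
// with the given config file path.
int train(const std::string& config_path) {
  // ... call into the existing C++ training loop here ...
  return 0;
}

PYBIND11_MODULE(cpp_rl, m) {
  m.doc() = "Python bindings for the C++ RL framework (illustrative)";
  m.def("train", &train, pybind11::arg("config_path"),
        "Run PPO training with the given config file");
}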

Developed by Kishore Ganesh, Qiongao Liu, Yusheng Jiao, Chenchen Huang, and Haotian Hang as the USC CSCI596 course final project.

We used our code to train on a cartpole environment. The learning curve using only one environment node is shown below.

learning curve

The learning curve using three environments running simultaneously is shown below.

learning curve

In the learning curves above, the dots show the reward of each episode, and the line shows the average reward over 100 episodes.
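A minimal sketch of how such a smoothed curve can be computed, using a trailing mean over a fixed window (100 episodes in the plots above); this helper is illustrative and not part of the repository:

// Illustrative helper: smooth per-episode rewards with a trailing mean.
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <vector>

std::vector<double> rolling_mean(const std::vector<double>& rewards,
                                 std::size_t window) {
  std::vector<double> smoothed(rewards.size());
  double running_sum = 0.0;
  for (std::size_t i = 0; i < rewards.size(); ++i) {
    running_sum += rewards[i];
    if (i >= window) running_sum -= rewards[i - window];  // drop the value leaving the window
    smoothed[i] = running_sum / static_cast<double>(std::min(i + 1, window));
  }
  return smoothed;
}

int main() {
  std::vector<double> rewards = {10, 12, 9, 30, 25};  // per-episode rewards (example data)
  for (double r : rolling_mean(rewards, 3)) std::cout << r << ' ';
  std::cout << '\n';
  return 0;
}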

Build

  1. Download the dependency: libtorch, and add libtorch to your environment path:
wget https://download.pytorch.org/libtorch/nightly/cpu/libtorch-shared-with-deps-latest.zip
unzip libtorch-shared-with-deps-latest.zip
export Torch_DIR=</absolute/path/to/libtorch>
echo "export Torch_DIR=</absolute/path/to/libtorch>" >> ~/.bashrc

clang-format is not required, but if you want to use "make format", you need to have it installed; on a Debian system:

sudo apt install clang-format
  2. Build the project. First way (using the Makefile):
make all 

or

make build

or

make debug

Second way (using CMake):

mkdir build
cd build
cmake  ..
cmake --build . --config Release

or

cmake --build . --config Debug

2.1 Build on USC CARC

module load gcc/8.3.0 
module load openmpi/4.0.2
module load cmake
export LD_PRELOAD=/spack/apps/gcc/8.3.0/lib64/libstdc++.so.6

build:

cmake -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ ..
  3. Run:
mpirun -n 4 ./cpp-rl-training <path/to/config/file> > out  

or

mpirun -n 4 ./cpp-rl-training  > out   

where 4 is the number of MPI processes (nodes) to launch; it must be at least 2 (one learning node plus at least one simulation node). If no config file is specified, the default config file (../config) is used. The "> out" redirection writes everything that would otherwise be printed to the terminal into a file called out.

3.1 Running on CARC using a .sl file