Skip to content

flexi-framework/relexi

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

62 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

logo

license doi

About Relexi

Relexi is a Reinforcement Learning (RL) framework developed for the high-order HPC flow solver FLEXI. However, Relexi is developed with modularity in mind and allows to used with other HPC solvers as well. Relexi builds upon TensorFlow and its RL extension TF-Agents. For the efficient communication, data handling and the managment of the simulations runs on HPC systems, Relexi uses the SmartSim package with its SmartRedis communication clients. For details on its scaling behavior, suitability for HPC and for use cases, please see

This is a scientific project. If you use Relexi or find it helpful, please cite the project using a suitable reference from the list above referring to either the general Relexi project, its HPC aspects or its application for scientific modeling tasks, respectively.

Documentation

The documentation of Relexi is built via pdoc, which can be installed via

pip install pdoc

Next, change into the docs folder and build the documentation via

cd docs
bash build_docs.sh

Open the resulting relexi.html with your browser.

Testing

A suite of unit tests is implemented for Relexi using the pytest testing environment. To run the tests, simply execute in the root directory

pytest

Installation

The following quick start details a standard installation of the Relexi framework.

Dependencies

Relexi has a variety of dependencies. The main dependencies of Relexi are listed in the following with their supported version.

Package Version Note
Python ≥3.9
TensorFlow 2.9.0 - 2.15.1
TF-Agents ≥0.13
SmartSim 0.4.0 - 0.6.2
SmartRedis ≥0.4.1
CMake ≥3.0
Make ≥4.0
gcc-fortran ≥9.4 gcc 10 not supported! (gcc ≥11 is fine)
gcc ≥9.4
gcc-c++ ≥9.4

Be ware that the major dependencies (SmartSim, TensorFlow, FLEXI) might have a more expansive dependency tree, for which we refer the user to the corresponding documentations for details.

Prerequisites

Open a terminal and change into the directory where you want to install Relexi and its dependecies For convenience, save the current directory with

ROOTDIR=$(pwd)

It is highly recommended to use some form of virtual environment for the installation. You can use any tool you like, we use virtualenv which can be installed with

python3 -m pip install virtualenv

Create and activate a new environment with

python3 -m virtualenv env_relexi
source env_relexi/bin/activate

Install the necessary dependencies

python3 -m pip install tensorflow smartredis cmake tf-agents pyyaml matplotlib

Install SmartSim

Install SmartSim in a supported version ranging from 0.4.0 to 0.6.2. The following commands install the latest supported version.

pip install smartsim==0.6.2
smart clobber
smart clean
smart build --no_tf --no_pt -v
SMARTSIM_DIR=$(smart site)
export PATH=$PATH:$SMARTSIM_DIR/_core/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${SMARTSIM_DIR}/_core/lib

Install FLEXI

Clone the required version of FLEXI from GitHub and build it with the standard compile flags

cd $ROOTDIR
git clone https://github.com/flexi-framework/flexi-extensions.git
cd flexi-extensions
git checkout smartsim
mkdir -p build && cd build
cmake .. -DLIBS_BUILD_HDF5=ON -DLIBS_USE_MPI=OFF -DLIBS_BUILD_SMARTREDIS=ON -DLIBS_USE_SMARTREDIS=ON -DLIBS_USE_FFTW=ON -DPOSTI=OFF -DFLEXI_TESTCASE=hit -DFLEXI_NODETYPE=GAUSS-LOBATTO -DFLEXI_SPLIT_DG=ON -DFLEXI_EDDYVISCOSITY=ON
make -j

This compiles FLEXI without MPI and thus in its serial version. To enable MPI or to change the configuration of FLEXI, please see the official documentation of FLEXI.

Install Relexi

Finally, clone the Relexi repository. No additional installation steps are required.

cd $ROOTDIR
git clone https://github.com/flexi-framework/relexi.git

Running the Code

Relexi comes with some example setups to test if the code runs. Enter the directory of the first test case with

cd relexi/examples/HIT_24_DOF/

Open the prm.yaml file in a text editor of your choice. If you have installed the flexi binary not in the default path, adapt the path of the executable under library_path accordingly. Then you can start the training process with

python3 ../../src/train.py prm.yaml

You may also set the number of parallel environments by setting num_parallel_environments according to your local hardware resources. You can change the number of processors used for each FLEXI environment by setting num_procs_per_environment to the appropriate value. Please be aware that for using FLEXI in parallel, i.e. with more than 1 CPU core per environment, it has to be compiled with MPI. Please refer to the FLEXI documentation for details.

Results

To visualize the results, Relexi uses the TensorBoard suite. After running the code, Relexi should create a directory logs, where the model, training checkpoints and the training metrics are saved. Open it with

tensorboard --logdir logs/

Tensorboard then provides a URL that can be opened in the Browser. If the training is performed on a remote server, the port where TensorBoard sends its data has to be redirected to your local machine. If you use ssh to connect to the server, you can redirect the standard TensorBoard port (6006) with

ssh -L 6006:127.0.0.1:6006 your_remote_server

About

This is the repository for Relexi, a scalable reinforcement learning framework for CFD on HPC systems

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages