Muzero

PyTorch implementation of MuZero for Gymnasium environments. It should support any Discrete, Box, or Box2D configuration of the observation space and action space.

I documented the code to follow the MuZero paper as closely as possible. You will find two types of comments: self-explanatory variable names and comments within the code.

P.S. Discrete and Box are Gymnasium space classes that describe the observations or actions of an environment. Discrete is a finite set of integers {0, ..., n-1}, so each sample is a single integer. Box is a bounded array of floating-point numbers or integers of any shape. In this README, Box refers to the 1D case and Box2D to a multi-dimensional Box (for example an image); Gymnasium itself has no separate Box2D space class.

MuZero -> MuZero Unplugged -> Stochastic MuZero

Getting started

Local Installation

Pip dependencies: requirements.txt

git clone https://github.com/DHDev0/Muzero.git

cd Muzero

pip install -r requirements.txt

If you run into difficulties, refer to the first cell of the Tutorial or use the Dockerfile.

Docker

Build the image (build time: 22 min, memory consumption: 8.75 GB):

docker build -t muzero .

(do not forget the trailing dot)

Start container:

docker run --cpus 2 --gpus 1 -p 8888:8888 muzero
#or
docker run --cpus 2 --gpus 1 --memory 2000M -p 8888:8888 muzero
#or
docker run --cpus 2 --gpus 1 --memory 2000M -p 8888:8888 --storage-opt size=15g muzero

docker run starts JupyterLab at https://localhost:8888//lab?token=token (you need the token) with all the dependencies required for CPU and GPU (NVIDIA) compute.

Option meanings:
--cpus 2 -> number of CPU cores allocated (2)
--gpus 1 -> number of GPUs allocated (1)
--storage-opt size=15g -> storage capacity allocated (15 GB; does not work with Windows WSL)
--memory 2000M -> RAM allocated (2 GB)
-p 8888:8888 -> publish port 8888 for JupyterLab (default port of the Dockerfile)

Stop the container:

docker stop $(docker ps -q --filter ancestor=muzero)

Delete the image:

docker rmi -f muzero

Dependency

Language :

Python 3.8 to 3.10 (bound by the backward compatibility of Ray and PyTorch)
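To confirm that your interpreter falls inside this range before installing, a small check along these lines may help. The function name `python_version_supported` is made up for this sketch and is not part of the repository.

```python
import sys

# Supported range taken from the line above: Python 3.8 to 3.10 inclusive.
MIN_SUPPORTED = (3, 8)
MAX_SUPPORTED = (3, 10)


def python_version_supported(version_info=None):
    """Return True if the (major, minor) version is within the supported range."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if python_version_supported() else "outside the supported range"
    print(f"Python {current} is {status}")
```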

Library :

torch 1.13.0
torchvision 0.14.0
ray 2.0.1
gymnasium 0.27.0
matplotlib >=3.0
numpy 1.21.5

More details in requirements.txt
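If you want to compare your environment against the pinned versions listed above, the following sketch uses the standard-library `importlib.metadata` module. The helper names `installed_versions` and `report` are illustrative only; matplotlib is left out because it only has a lower bound.

```python
from importlib import metadata

# Pinned versions copied from the list above.
EXPECTED = {
    "torch": "1.13.0",
    "torchvision": "0.14.0",
    "ray": "2.0.1",
    "gymnasium": "0.27.0",
    "numpy": "1.21.5",
}


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def report(expected):
    """Return one line per package comparing installed and expected versions."""
    lines = []
    for name, installed in installed_versions(expected).items():
        if installed is None:
            lines.append(f"{name}: not installed (expected {expected[name]})")
        elif installed.split("+")[0] == expected[name]:
            # Local build tags such as "+cu117" are ignored in the comparison.
            lines.append(f"{name}: {installed} OK")
        else:
            lines.append(f"{name}: {installed} differs from {expected[name]}")
    return lines


if __name__ == "__main__":
    print("\n".join(report(EXPECTED)))
```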

Usage

Jupyter Notebook

For a practical example, see the Tutorial.

CLI

Set up your config file (examples): https://github.com/DHDev0/Muzero/blob/main/config/

First, cd to the project folder:
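Before passing a config file to the CLI, it can be useful to confirm that it exists and parses as JSON. The sketch below does only that. It assumes the file holds a JSON object at the top level and makes no assumption about which keys the project expects; `load_config` and `summarize` are hypothetical helper names.

```python
import json
from pathlib import Path


def load_config(path):
    """Parse a JSON config file and return its contents, failing early on bad input."""
    config_path = Path(path)
    if not config_path.is_file():
        raise FileNotFoundError(f"Config file not found: {config_path}")
    with config_path.open(encoding="utf-8") as handle:
        config = json.load(handle)  # raises json.JSONDecodeError on malformed JSON
    # Assumption: the top level is a JSON object, not a list or a scalar.
    if not isinstance(config, dict):
        raise ValueError(f"Expected a JSON object at the top level of {config_path}")
    return config


def summarize(config):
    """List the top-level keys so you can see which sections the file defines."""
    return sorted(config)
```

For example, `summarize(load_config("config/experiment_133_config.json"))` would list the top-level keys of the config used in the commands in this README.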

cd Muzero

Training :

python muzero_cli.py train config/experiment_133_config.json

Training with report :

python muzero_cli.py train report config/experiment_133_config.json

Inference (play game with specific model) :

python muzero_cli.py play config/experiment_133_config.json

Training and Inference :

python muzero_cli.py train play config/experiment_133_config.json

Benchmark model :

python muzero_cli.py benchmark config/experiment_133_config.json

Training + Report + Inference + Benchmark :

python muzero_cli.py train report play benchmark config/experiment_133_config.json
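The commands above share one pattern: one or more action keywords followed by the config path. If you want to launch them from a Python script, a thin wrapper could look like the sketch below. It assumes the CLI accepts the actions in the order given, as the examples suggest; `build_command` and `run` are hypothetical helpers, not part of `muzero_cli.py`.

```python
import subprocess
import sys

# Action keywords that appear in the commands above.
KNOWN_ACTIONS = ("train", "report", "play", "benchmark")


def build_command(actions, config_path, script="muzero_cli.py"):
    """Build the argument list: interpreter, script, actions in order, config path."""
    if not actions:
        raise ValueError("At least one action is required")
    unknown = [action for action in actions if action not in KNOWN_ACTIONS]
    if unknown:
        raise ValueError(f"Unknown action(s): {unknown}")
    return [sys.executable, script, *actions, config_path]


def run(actions, config_path):
    """Run the CLI from the project folder and raise if it exits with an error."""
    subprocess.run(build_command(actions, config_path), check=True)
```

Calling `run(["train", "report"], "config/experiment_133_config.json")` from the project folder would then be equivalent to the "Training with report" command.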

Features

[x] implemented , [ ] unimplemented

How to make your own custom gym environment?

Refer to the Gym documentation

You will be able to use your custom environment in MuZero after you register it with Gymnasium.

Authors

Daniel Derycke

Subjects

Deep reinforcement learning

License

GPL-3.0 license
