AlphaZero for singleplayer environments implemented efficiently using Ray

seawee1/efficientalphazero

(Efficient) AlphaZero

An efficient and clean implementation of AlphaZero for single-player domains in PyTorch. The implementation is inspired by the awesome EfficientZero implementation, a derivative work building on MuZero. Another invaluable resource was the alphazero_singleplayer repository and the corresponding blog post.

Features

  • Worker parallelization using Ray
  • Model inference parallelism via Batch MCTS
  • Automatic mixed precision (AMP) support
  • Several improvements from MuZero, such as min-max value scaling and discrete value support for intermediate rewards during MCTS
  • Model pre-training and training data enrichment through demonstrations (similar to AlphaTensor)
  • Easily extendable to new single-player environments (just subclass BaseConfig)
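
To illustrate what min-max value scaling means in practice, here is a minimal sketch of the idea as described for MuZero: the search tracks the smallest and largest value estimates seen so far and rescales Q-values into [0, 1] before they enter the selection formula, so that unbounded single-player returns stay comparable to the exploration term. The class name `MinMaxStats` and its methods are assumptions made for this example and are not necessarily the names used in this repository.

```python
class MinMaxStats:
    """Tracks the value range observed during search (illustrative sketch)."""

    def __init__(self):
        self.minimum = float("inf")
        self.maximum = float("-inf")

    def update(self, value: float) -> None:
        # Widen the observed range whenever a new value estimate is backed up.
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def normalize(self, value: float) -> float:
        # Rescale only once two distinct values have been seen;
        # otherwise the range is empty and the raw value is returned.
        if self.maximum > self.minimum:
            return (value - self.minimum) / (self.maximum - self.minimum)
        return value


# Hypothetical value estimates backed up during one search.
stats = MinMaxStats()
for q in (12.0, 20.0, 16.0):
    stats.update(q)

normalized = [stats.normalize(q) for q in (12.0, 20.0, 16.0)]
print(normalized)
```

In a tree search, `update` would typically be called during backup and `normalize` when scoring children during selection.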

Setup

Install the Python dependencies, then install PyTorch via conda:

pip install -r requirements.txt
conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia

Example usage

To train AlphaZero on CartPole, run:

python main.py --env cartpole --opr train,test
