
Uctoth

Self-learning Othello (Reversi) game engine written in CoffeeScript 2. A demo page is here.

Experimental

This code is mostly experimental and not intended for reuse; reusing it may take some work. Read the source if you are interested.

Features

CoffeeScript 2

Works both on Node.js (with ES6+ features) and in browsers (with help from Babel and webpack).

Most of the CPU-intensive parts (endgame solving, game tree search, and learning) are now also written in C. They work as a Node.js add-on and as WebAssembly in browsers (except learning). The CoffeeScript implementations still exist and are used when the add-on/wasm is not available.
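
As a rough illustration, the dispatch looks something like the following sketch (the module paths are hypothetical, not the actual ones in this repo):

```coffeescript
# Hypothetical sketch of the fallback pattern: prefer the native
# Node add-on, fall back to the pure CoffeeScript implementation.
solve = try
  require('./build/Release/native').solve   # native add-on, if built
catch
  require('./solver').solve                 # pure CoffeeScript fallback
```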

Self learning

Starting from random play, the engine strengthens itself through loops of self-play and machine learning. It uses no existing game records.

The included weights.json file is the result of learning from ~100,000 self-played games.

Game tree search

Employs a modified UCT search with static evaluation. It searches up to 15-25 plies deep on modern JavaScript engines; the C version searches 30+ plies deep. It beats my implementation of minimax with NegaScout with a 70-80% winning rate.
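
For reference, plain UCT selects children with the UCB1 rule. A minimal sketch (this repo's modified variant differs, and it scores leaves with static evaluation instead of random playouts; see the source):

```coffeescript
# UCB1 child selection as used by plain UCT (illustrative sketch only).
C = Math.sqrt 2

ucb = (child, parent_visits) ->
  child.value / child.visits +
    C * Math.sqrt(Math.log(parent_visits) / child.visits)

select_child = (node) ->
  best = null
  for child in node.children
    best = child if not best? or ucb(child, node.visits) > ucb(best, node.visits)
  best
```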

Static evaluation

A classic pattern-based evaluation method, as described in publications by Michael Buro. He is the god of computer Othello for me.
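
In this scheme, each pattern (an edge, a diagonal, a corner region, and so on) indexes a weight table by its ternary square configuration, and the position's score is the sum over all pattern instances. A minimal sketch with hypothetical data structures (the actual names, table layout, and pattern set differ):

```coffeescript
# Pattern-based evaluation in Buro's style (illustrative sketch only).
# board[sq] is 0 (empty), 1 (own disc) or 2 (opponent disc).
evaluate = (board, weights, patterns) ->
  score = 0
  for pattern in patterns
    index = 0
    index = index * 3 + board[sq] for sq in pattern.squares
    score += weights[pattern.table][index]
  score
```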

Machine learning

Simple linear regression over ~57,000 sparse features per game position. Optionally supports logistic regression, which predicts the probability of winning instead of the final score. Mini-batch training with AdaGrad is now used for faster convergence. Also written from scratch in CoffeeScript/C.
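
The AdaGrad idea is to give each weight its own learning rate that shrinks as its squared gradients accumulate, which suits sparse features. A hypothetical sketch of one mini-batch step for the linear-regression case (not the repo's actual code):

```coffeescript
# AdaGrad mini-batch step for sparse linear regression (illustrative).
# Each example lists the indices of its active features, all with value 1.
adagrad_step = (w, g2, batch, lr = 0.1, eps = 1e-8) ->
  for {features, target} in batch
    pred = 0
    pred += w[i] for i in features
    err = pred - target               # gradient of squared error wrt w[i]
    for i in features
      g2[i] += err * err              # accumulate squared gradient
      w[i] -= lr * err / (Math.sqrt(g2[i]) + eps)
  return
```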

Usage

First, run yarn to install dependencies and build the Node add-on.

yarn

Most scripts expect a weights.json file to exist in this directory. Copy it from the ref directory at first (then build your own).

cp ref/weights.json .

To run the scripts written in CoffeeScript 2, either install coffeescript globally or run them with npx coffee.

Bootstrapping

This section describes how to train your own weights.json from scratch.

First, generate 1,000 randomly played games.

npx coffee selfplay-rnd -R -n 1000 -w 12 -f 10 -b 1000000 --min_col=0

The games are random, but the last 10 moves are played perfectly. Generated games are stored in book.db (an SQLite3 database).

Next, learn from the generated games and build your first weights.json.

npx coffee learn

If you can't use the Node add-on for some reason, use reg instead.

Now you can remove the book.db of random games.

rm book.db*

Copy the auto script from the samples directory.

cp samples/auto .

You may edit auto as you like. Now you are ready to run the automatic self-learning loop.

./auto

It runs regression and a 30-game match against ref/weights.json for every 1,000 games generated. Match results are appended to match.log.

Running self-play and learning in parallel

Running self-play and regression simultaneously can utilize multi-core CPUs and thus speed up the entire self-learning process.

Sample scripts are in the samples directory; copy them.

cp samples/selfplay-loop samples/reg-loop samples/match-loop .

Edit the scripts as needed. Then run reg-loop in one terminal, selfplay-loop in another, and match-loop in a third. This setup uses 3 CPU threads; if that isn't enough for your machine, you may run as many selfplay-loop instances as you want.

The trick is simple: all self-play scripts watch weights.json for changes. When reg-loop finishes creating a new weights.json, the self-play scripts exit and are invoked again by their shell script.
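
A minimal sketch of the watching side (illustrative, not the repo's actual code):

```coffeescript
# Exit when weights.json changes; the wrapping shell loop then
# restarts self-play with the new weights.
fs = require 'fs'
fs.watch 'weights.json', ->
  console.log 'weights.json updated, exiting so the loop restarts us'
  process.exit 0
```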

The watch script is useful for checking that everything is working well.

Automatic tuning of L2 regularization parameter

learn (and its CoffeeScript counterparts reg/minibatch) uses L2 regularization to avoid over-fitting. Giving it an optimal parameter is important for building strong weights.

The l2tune script is written for this purpose. It uses K-fold cross-validation and can optionally play a 30-game match to find the strongest parameter. The result is written to l2.json, and learn reads values from this file if available. These are very time-consuming processes, so it's recommended to run l2tune without --match periodically while learning, and to use l2tune --match for the final polish of your weights.json.
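
For reference, K-fold cross-validation trains on K-1 folds and validates on the held-out one, averaging the validation error per candidate parameter. A hypothetical sketch (fit and validation_error stand in for the actual routines):

```coffeescript
# Average K-fold validation error for one L2 parameter (illustrative).
kfold_error = (positions, lambda, K = 5) ->
  folds = (positions.filter((p, i) -> i % K == k) for k in [0...K])
  total = 0
  for k in [0...K]
    train = [].concat (folds[j] for j in [0...K] when j isnt k)...
    w = fit(train, lambda)                 # hypothetical training routine
    total += validation_error(w, folds[k]) # hypothetical held-out error
  total / K
```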

If the Node add-on is not available, use l2seach and l2opt instead.

Acknowledgement

Sound created by Nobuyuki Honda.

License

MIT

Copyright © 2018-2019 by Takeshi Sone