spranesh/RL-TicTacToe


Tic-Tac-Toe
--------------

Software Architecture:

RL-Glue mechanism:
 * "Environment" - Returns a 'state', the set of valid actions, and a reward.
 * "Agent" - Returns an action.
 * Both run on a common platform.

 * A sample invocation would be

      ./main.py 100 "OptimalAgent" "TicTacToe:random:RandomAgent"

   This starts TicTacToe with OptimalAgent as the Agent and RandomAgent as
   the opponent; which of the two starts first is chosen at random.

 * Another sample invocation would be

      ./main.py 100 "PolicyGradient" "TicTacToe:false:OptimalAgent"

   This does the same with PolicyGradient as the Agent and OptimalAgent as
   the opponent, except that the Agent now always starts first.
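
The Environment/Agent split described above can be sketched as a small loop. This is only an illustration, not the repository's code: the class names, the method names (`reset`, `step`, `act`), the `opponent_starts` flag and the reward values (+1 win, 0 draw, -1 loss) are all assumptions made for the example.

```python
import random

# All eight three-in-a-row lines on a 3x3 board indexed 0..8.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that mark has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None


class RandomAgent:
    """Hypothetical agent: picks uniformly among the valid actions."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)

    def act(self, state, actions):
        return self.rng.choice(actions)


class TicTacToeEnv:
    """Hypothetical environment: the Agent plays 'X', the opponent plays 'O'."""

    def __init__(self, opponent, opponent_starts=False):
        self.opponent = opponent
        self.opponent_starts = opponent_starts
        self.board = [' '] * 9

    def _actions(self):
        return [i for i, cell in enumerate(self.board) if cell == ' ']

    def _opponent_move(self):
        move = self.opponent.act(tuple(self.board), self._actions())
        self.board[move] = 'O'

    def reset(self):
        """Start a new game; return (state, valid actions)."""
        self.board = [' '] * 9
        if self.opponent_starts:
            self._opponent_move()
        return tuple(self.board), self._actions()

    def step(self, action):
        """Apply the Agent's move, then the opponent's reply.

        Returns (state, valid actions, reward, done).
        Assumed rewards: +1 win, -1 loss, 0 otherwise.
        """
        if self.board[action] != ' ':
            raise ValueError("invalid action")
        self.board[action] = 'X'
        if winner(self.board) == 'X':
            return tuple(self.board), [], 1.0, True
        if not self._actions():
            return tuple(self.board), [], 0.0, True
        self._opponent_move()
        if winner(self.board) == 'O':
            return tuple(self.board), [], -1.0, True
        actions = self._actions()
        return tuple(self.board), actions, 0.0, not actions


def run_episode(agent, env):
    """Play one game and return the final reward seen by the Agent."""
    state, actions = env.reset()
    reward, done = 0.0, False
    while not done:
        action = agent.act(state, actions)
        state, actions, reward, done = env.step(action)
    return reward


if __name__ == "__main__":
    env = TicTacToeEnv(RandomAgent(seed=1), opponent_starts=False)
    agent = RandomAgent(seed=0)
    rewards = [run_episode(agent, env) for _ in range(100)]
    print("mean reward over 100 games:", sum(rewards) / len(rewards))
```

In this sketch the opponent lives inside the environment, which mirrors how the invocations above pass the opponent as part of the environment string; whether the repository wires it the same way is not stated here.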

About

RL agents to play TicTacToe
