Assignment 2 for Course L.EIC029 Artificial Intelligence, FEUP LEIC 3rd Year 2nd Semester

FabioMiguel2000/LOA-feat.Reinforcement-Learning

Lines of Action Using RL

Pygame StableBaselines OpenAIGym

This project was developed for the Artificial Intelligence course at FEUP. It uses reinforcement learning to solve a simplified version of the board game Lines of Action.

Project Grade: 19.5/20

Figures: Random Agent (Before Training) | Trained Agent using TRPO

Installation and prerequisites

  1. Install Python 3 (see the official website).
  2. Running in a conda environment is recommended; we suggest Miniconda.
  3. After installing Python 3, run the following command to install the required libraries:
pip install -r requirements.txt
  4. [OPTIONAL] If installing or importing TensorBoard fails with an error about the protobuf version, run the following command to fix it:
pip install protobuf~=3.19.0
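
Before applying the optional protobuf fix, it can help to check which version is actually installed. The following is a minimal sketch, not part of the project: the helper names `satisfies_compatible_release` and `protobuf_status` are hypothetical, and the check is simplified (it treats `~=3.19.0` as "any 3.19.x" and ignores pre-release suffixes).

```python
from importlib.metadata import PackageNotFoundError, version


def satisfies_compatible_release(installed, major=3, minor=19):
    """Return True if `installed` looks like a 3.19.x release.

    Simplified stand-in for the pip specifier `~=3.19.0`
    (which means >=3.19.0 and ==3.19.*); pre-release tags are not handled.
    """
    parts = installed.split(".")
    try:
        return int(parts[0]) == major and int(parts[1]) == minor
    except (IndexError, ValueError):
        return False


def protobuf_status():
    """Describe the installed protobuf version, if any (hypothetical helper)."""
    try:
        installed = version("protobuf")
    except PackageNotFoundError:
        return "protobuf is not installed"
    if satisfies_compatible_release(installed):
        return f"protobuf {installed} matches ~=3.19.0"
    return f"protobuf {installed} is outside ~=3.19.0"


if __name__ == "__main__":
    print(protobuf_status())
```

If the reported version is outside the 3.19 series and TensorBoard fails to import, the pip command above is the fix the authors suggest.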

How to run

On Windows, from the command line inside the /src directory:

python main.py [--board=BOARD_SIZE]

On Linux or macOS, from the command line inside the /src directory:

python3 main.py [--board=BOARD_SIZE]

Options:


        [--board=BOARD_SIZE] options:
                --board=4 : 4x4 board
                --board=5 : 5x5 board (default)
                --board=6 : 6x6 board

        For example:

            python3 main.py
                # or
            python3 main.py --board=4
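
For readers curious how an option like this can be handled, here is a minimal sketch using Python's standard `argparse` module. It is an illustration only: how main.py actually parses `--board` is not shown in this README, and the function name `parse_board_size` is hypothetical.

```python
import argparse


def parse_board_size(argv=None):
    """Parse --board from `argv` (defaults to sys.argv[1:]) and return the size.

    Only 4, 5 and 6 are accepted, with 5 as the default, mirroring the
    options listed above. argparse exits with an error for other values.
    """
    parser = argparse.ArgumentParser(description="Lines of Action with RL (sketch)")
    parser.add_argument(
        "--board",
        type=int,
        choices=[4, 5, 6],
        default=5,
        help="board size N for an NxN board (default: 5)",
    )
    args = parser.parse_args(argv)
    return args.board


if __name__ == "__main__":
    size = parse_board_size()
    print(f"Board size: {size}x{size}")
```

Both `--board=4` and `--board 4` are accepted by argparse, so the commands shown above would work unchanged with this sketch.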
                                       

Guide

  • After installing the prerequisites, run the command shown above:
    • The following 3 RL models are trained with a default of TIMESTEPS=15000 (to change it, edit the TIMESTEPS variable in main.py):
      1. Proximal Policy Optimization (PPO)
      2. Advantage Actor Critic (A2C)
      3. Trust Region Policy Optimization (TRPO)
    • Immediately after training, an agent runs each model in order, and a UI window pops up showing the moves chosen
    • The terminal also prints details of the actions, rewards and observations of the system
  • To view graphs and plots for the trained models, run the command:
tensorboard --logdir=logs 
  • Open a browser and go to http://localhost:6006 (the port number may vary; see the terminal output for details)
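
The training step described in this guide could look roughly like the sketch below. It is not the project's main.py. It assumes PPO and A2C come from Stable-Baselines3 and TRPO from the separate sb3-contrib package (the project's requirements.txt may pin something different), and it assumes `env` is a Gym-compatible environment for the game. The names `load_algorithms` and `train_all` are hypothetical.

```python
TIMESTEPS = 15000  # default named in the guide above
LOG_DIR = "logs"   # matches `tensorboard --logdir=logs`
ALGORITHM_NAMES = ["PPO", "A2C", "TRPO"]  # training order from the guide


def load_algorithms():
    """Import the algorithm classes lazily, only when training starts.

    ASSUMPTION: PPO and A2C live in stable_baselines3 and TRPO in sb3_contrib.
    """
    from stable_baselines3 import A2C, PPO
    from sb3_contrib import TRPO

    return {"PPO": PPO, "A2C": A2C, "TRPO": TRPO}


def train_all(env, timesteps=TIMESTEPS, log_dir=LOG_DIR):
    """Train each algorithm on `env` in order and return the models by name.

    Each run writes TensorBoard logs under `log_dir`, in a subfolder
    named after the algorithm.
    """
    algorithms = load_algorithms()
    models = {}
    for name in ALGORITHM_NAMES:
        model = algorithms[name]("MlpPolicy", env, verbose=1, tensorboard_log=log_dir)
        model.learn(total_timesteps=timesteps, tb_log_name=name)
        models[name] = model
    return models
```

Calling `train_all(env)` with the project's environment would produce one log folder per algorithm, which is what the `tensorboard --logdir=logs` command then displays.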

Group Members
