Super Mario RL Project

Overview

This project implements reinforcement learning algorithms to train agents in the Super Mario Gym environment. The goal is to create an intelligent agent that can navigate and complete levels using RL techniques that address the sparse-reward problem inherent to this environment.
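A common remedy for sparse rewards, and the one the '--icm' option below suggests, is the Intrinsic Curiosity Module (ICM) of Pathak et al. (2017), which rewards the agent for visiting states its forward model predicts poorly. The following is a minimal PyTorch sketch of the standard formulation; the class and attribute names are illustrative, not this repository's actual code.

import torch
import torch.nn as nn

# Minimal ICM sketch (standard formulation; names are illustrative).
class ICM(nn.Module):
    def __init__(self, obs_dim, n_actions, feat_dim=256, eta=0.01):
        super().__init__()
        self.eta = eta  # scaling factor for the curiosity bonus
        self.encoder = nn.Sequential(nn.Linear(obs_dim, feat_dim), nn.ReLU())
        # Forward model: predict next-state features from current features + action.
        self.forward_model = nn.Linear(feat_dim + n_actions, feat_dim)
        # Inverse model: predict the action from consecutive state features.
        self.inverse_model = nn.Linear(2 * feat_dim, n_actions)

    def intrinsic_reward(self, obs, next_obs, action_onehot):
        phi, phi_next = self.encoder(obs), self.encoder(next_obs)
        phi_next_pred = self.forward_model(torch.cat([phi, action_onehot], dim=-1))
        # Curiosity bonus = scaled prediction error of the forward model.
        return self.eta / 2 * (phi_next_pred - phi_next).pow(2).sum(dim=-1)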

Features

  • Reinforcement Learning Algorithms: Implement and experiment with various RL algorithms such as SARSA, Double Deep Q-Learning, A3C, and an Intrinsic Curiosity Module (ICM).
  • Super Mario Gym Environment: Use the OpenAI Gym environment for Super Mario, which provides a simulation for training and evaluating the agent (a minimal setup sketch follows this list).
  • Visualization: Include visualizations and graphs to demonstrate the learning progress of the agent over time.
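
The README does not name the environment package explicitly; Super Mario Gym projects typically build on gym-super-mario-bros and nes-py, so the sketch below assumes those. It creates the environment, restricts the action space, and runs a random policy for a few steps.

import gym_super_mario_bros
from nes_py.wrappers import JoypadSpace
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT

# Create the environment and restrict it to a small, discrete action set.
env = gym_super_mario_bros.make('SuperMarioBros-1-1-v0')
env = JoypadSpace(env, SIMPLE_MOVEMENT)

state = env.reset()
for _ in range(100):
    # Random actions, just to exercise the interaction loop.
    state, reward, done, info = env.step(env.action_space.sample())
    if done:
        state = env.reset()
env.close()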

Getting Started

Prerequisites

  • Python 3
  • All required packages are listed in the requirements.txt file.

Installation

  1. Clone the repository:
git clone https://github.com/DavideEspositoPelella/SuperMario-RL.git
  2. Navigate to the SuperMario-RL folder:
cd SuperMario-RL
  3. Install tkinter (required for certain graphical operations in Python):
sudo apt-get install python3.8-tk
  4. Set up a Python environment:
  • Using a virtual environment (optional but recommended):
sudo apt install python3.8-venv
python3 -m venv venv
source venv/bin/activate # On Windows use 'venv\Scripts\activate'
  • Using Conda:
conda create -n supermario_rl python=3.8
conda activate supermario_rl
  5. Install dependencies:
pip install --no-cache-dir -r requirements.txt

Executing

You can run the program with various options:

python3 main.py [OPTIONS]

Command-Line Arguments (an illustrative parsing sketch follows this list):

  • '-t', '--train': Enable training mode.
  • '-e', '--evaluate': Enable evaluation mode.
  • '--algorithm': Specify the algorithm to use. Options are ddqn, ddqn_per, a2c. Default is a2c.
  • '--episodes <num_episodes>': Set the number of episodes. Default for training is 20000.
  • '--icm': Use the ICM module for curiosity-driven exploration. Disabled by default.
  • '--log-freq': Logging frequency (for A2C it is relative to the number of episodes, for DDQN it is relative to the global step count). Default is 10.
  • '--save-freq': Saving frequency (for A2C it is relative to the number of episodes, for DDQN it is relative to the global step count). Default is 100.
  • '--log-dir': Directory to save logs. Default is ./logs/.
  • '--save-dir': Directory to save trained models. Default is ./trained_models/.
  • '--model': Load a specific model checkpoint to continue training or to evaluate.
  • '--tb': Enable TensorBoard logging.
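
The actual parsing lives in main.py; the sketch below is only an argparse illustration consistent with the options documented above, not the repository's code.

import argparse

# Illustrative parser mirroring the documented options (not the repo's actual code).
parser = argparse.ArgumentParser(description='Super Mario RL')
parser.add_argument('-t', '--train', action='store_true', help='enable training mode')
parser.add_argument('-e', '--evaluate', action='store_true', help='enable evaluation mode')
parser.add_argument('--algorithm', choices=['ddqn', 'ddqn_per', 'a2c'], default='a2c')
parser.add_argument('--episodes', type=int, default=20000, help='number of episodes')
parser.add_argument('--icm', action='store_true', help='use the ICM module')
parser.add_argument('--log-freq', type=int, default=10)
parser.add_argument('--save-freq', type=int, default=100)
parser.add_argument('--log-dir', default='./logs/')
parser.add_argument('--save-dir', default='./trained_models/')
parser.add_argument('--model', help='checkpoint to load for training or evaluation')
parser.add_argument('--tb', action='store_true', help='enable TensorBoard logging')
args = parser.parse_args()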

Examples

  1. Run training with default settings
python3 main.py -t
  1. Run training with a specific algorithm and number of episodes (and TensorBoard logging)
python3 main.py -t --episodes 25000 --algorithm a2c --icm --tb
  1. Run evaluation
python3 main.py -e --algorithm a2c --icm --model mario_net.chkpt
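
With '--tb' enabled, training curves can be inspected by pointing TensorBoard at the log directory (./logs/ by default):

tensorboard --logdir ./logs/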

Winning Model

To evaluate the model that completes the level:

python3 main.py -e --algorithm ddqn --icm --model mario_net.chkpt
