Super-Mario-Bros Reinforcement Learning: QL vs Sarsa

This project concerns the development of an intelligent agent for Super Mario Bros, the famous game produced by Nintendo. More specifically, the goal was to design, implement, and train an agent with the Q-learning (QL) reinforcement learning algorithm, and then compare its learning results with those of the SARSA algorithm. Our case study also covers other learning approaches: Double Q-learning, Deep Q-Network (DQN), and Double Deep Q-Network (DDQN). These variants were included in order to compare performance. For more information, read our report.

The parameters and plots of the relevant QL models are located under ./code/Reinforcement_Learning/models, while those of the SARSA models are located under ./code/Reinforcement_Learning/sarsa/models.

Demo: world-1-1, n_stack=4

Requirements (tested)

| Module | Version |
| --- | --- |
| gym | 0.25.2 |
| gym-super-mario-bros | 7.4.0 |
| nes-py | 8.2.1 |
| pyglet | 1.5.21 |
| torch | 2.1.1 |
| pygame | 2.5.2 |
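Assuming pip, one way to install the tested versions is:

```sh
pip install gym==0.25.2 gym-super-mario-bros==7.4.0 nes-py==8.2.1 pyglet==1.5.21 torch==2.1.1 pygame==2.5.2
```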

Gym Environment

We used the gym-super-mario-bros environment. The setup code can be found in ./code/Reinforcement_Learning/utils/enviroment.py. The QL agents' logic can be found in ./code/Reinforcement_Learning/utils/agents, while the SARSA agents and their models can be found in ./code/Reinforcement_Learning/sarsa.
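For reference, here is a minimal, self-contained setup of the same environment with the versions listed above (a sketch driven by a random policy, independent of the repo's enviroment.py):

```python
import gym_super_mario_bros
from nes_py.wrappers import JoypadSpace
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT

# Create the World 1-1 environment and restrict the action space
env = gym_super_mario_bros.make("SuperMarioBros-1-1-v0")
env = JoypadSpace(env, SIMPLE_MOVEMENT)

# With gym 0.25.2 / nes-py 8.2.1 the environment uses the classic
# 4-tuple step API: (state, reward, done, info)
state = env.reset()
done = False
while not done:
    state, reward, done, info = env.step(env.action_space.sample())
    env.render()
env.close()
```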

In ./code/Reinforcement_Learning/utils/setup_env.py we assign custom values to the rewards, so as to encourage the agent to collect as many power-ups as possible (a sketch of one possible implementation follows the list). The custom rewards are:

  • time: -0.1, per in-game second that passes
  • death: -100, Mario dies
  • extra_life: 100, Mario gets an extra life
  • mushroom: 20, Mario eats a mushroom and becomes big
  • flower: 25, Mario eats a flower
  • mushroom_hit: -10, Mario gets hit while big
  • flower_hit: -15, Mario gets hit while Fire Mario
  • coin: 15, Mario collects a coin
  • score: 15, Mario hits an enemy
  • victory: 1000, Mario wins the level
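The repo's actual shaping lives in setup_env.py; purely as an illustration, here is one way such values could be applied with a gym.Wrapper, using the info dict that gym-super-mario-bros exposes (keys such as time, coins, score, life, status, and flag_get). The CustomRewardWrapper class and its delta logic are a hypothetical sketch, not the repo's code:

```python
import gym

# The reward values from the list above; the real ones live in setup_env.py
REWARDS = {"time": -0.1, "death": -100.0, "extra_life": 100.0,
           "mushroom": 20.0, "flower": 25.0, "mushroom_hit": -10.0,
           "flower_hit": -15.0, "coin": 15.0, "score": 15.0, "victory": 1000.0}

STATUS_RANK = {"small": 0, "tall": 1, "fireball": 2}  # Mario's power-up state

class CustomRewardWrapper(gym.Wrapper):
    """Hypothetical reward shaping based on gym-super-mario-bros's info dict."""

    def reset(self, **kwargs):
        self.prev = None
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        if self.prev is not None:
            p = self.prev
            reward += REWARDS["time"] * max(0, p["time"] - info["time"])    # clock ticking
            reward += REWARDS["coin"] * max(0, info["coins"] - p["coins"])  # coins collected
            if info["score"] > p["score"]:
                reward += REWARDS["score"]                                  # e.g. enemy hit
            if info["life"] > p["life"]:
                reward += REWARDS["extra_life"]
            elif info["life"] < p["life"]:
                reward += REWARDS["death"]
            cur, old = STATUS_RANK[info["status"]], STATUS_RANK[p["status"]]
            if cur > old:   # powered up: small -> tall (mushroom) or tall -> fire (flower)
                reward += REWARDS["mushroom"] if cur == 1 else REWARDS["flower"]
            elif cur < old: # got hit while big or while Fire Mario
                reward += REWARDS["mushroom_hit"] if old == 1 else REWARDS["flower_hit"]
        if info.get("flag_get"):
            reward += REWARDS["victory"]
        self.prev = info
        return obs, reward, done, info
```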

Training & Results

We used the QL, Double QL, DQN, and DDQN agents together with their respective SARSA counterparts, all with an epsilon-greedy policy. Each model was trained for 1,000 steps and took about 3.5 hours to finish, except for DDQN and DDN Sarsa, which were trained for 10,000 steps and took about 13.4 hours.
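To make the comparison concrete: in tabular form, the two update rules differ only in the bootstrap target. The following minimal sketch (the hyperparameters are placeholders, not the repo's settings) shows Q-learning's off-policy max versus SARSA's on-policy next action, plus the epsilon-greedy policy both share:

```python
import numpy as np

def epsilon_greedy(Q, s, n_actions, eps=0.1):
    # With probability eps explore; otherwise exploit the current estimates
    if np.random.rand() < eps:
        return np.random.randint(n_actions)
    return int(np.argmax(Q[s]))

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    # Off-policy: bootstrap from the greedy (max) action in the next state
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    # On-policy: bootstrap from the action the policy actually takes next
    Q[s, a] += alpha * (r + gamma * Q[s_next, a_next] - Q[s, a])
```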

Below are the results of the best models, comparing the QL and SARSA families directly.

| | world-1-1-n_stack=1 | world-1-1-n_stack=2 | world-1-1-n_stack=4 |
| --- | --- | --- | --- |
| Training steps | 10K | 10K | 10K |
| Episode score | 1723 | 4100 | 4320 |
| Agent | DDN Sarsa | DDN Sarsa | DDQN |
| Completed level? | False | True | True |

As future work, we could implement the PPO algorithm and compare it against both the QL and SARSA agents, in order to figure out which algorithm works best for Super Mario Bros.

Author & Contacts

Alberto Montefusco

  • Developer: Alberto-00
  • Email: a.montefusco28@studenti.unisa.it
  • LinkedIn: Alberto Montefusco
  • Website: alberto-00.github.io

Alessandro Aquino

  • Developer: AlessandroUnisa
  • Email: a.aquino33@studenti.unisa.it
  • LinkedIn: Alessandro Aquino

Mattia d'Argenio

  • Developer: mattiadarg
  • Email: m.dargenio5@studenti.unisa.it
  • LinkedIn: Mattia d'Argenio

