
Qlearning_vs_SARSA

Code that compares two learning agents in a classic gridworld game: one uses the off-policy Q-learning approach, and the other uses the on-policy State-Action-Reward-State-Action (SARSA) approach.
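For reference, the core difference between the two update rules can be sketched as below. This is a minimal illustration, not the repository's actual code; the flattened-state `Q` table layout and the parameter names are assumptions:

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.5, gamma=1.0):
    """Off-policy: bootstrap from the greedy (max-value) action in s_next,
    regardless of which action the behavior policy actually takes next."""
    target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.5, gamma=1.0):
    """On-policy: bootstrap from the action a_next actually taken in s_next
    by the same (epsilon-greedy) policy being learned."""
    target = r + gamma * Q[s_next, a_next]
    Q[s, a] += alpha * (target - Q[s, a])
```

The single changed line in the target is what makes Q-learning learn the greedy path along the cliff edge, while SARSA accounts for its own exploratory steps and tends to learn a safer path.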

Image of the grid used in the game:

[Image: cliff walk grid]

The code should produce graphs like the ones below, which show the average rewards for the agents over 500 epochs at varying levels of exploration (epsilon values).
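Epsilon controls how often the agents pick a random action instead of the current best one. A minimal sketch of epsilon-greedy action selection (the function name and `Q` shape are assumptions, not necessarily the repository's code):

```python
import numpy as np

def epsilon_greedy(Q, s, epsilon, rng):
    """With probability epsilon, explore a uniformly random action;
    otherwise exploit the greedy action for state s."""
    if rng.random() < epsilon:
        return int(rng.integers(Q.shape[1]))  # random action
    return int(np.argmax(Q[s]))               # greedy action

# Example: pick an action for state 0 with epsilon = 0.1
rng = np.random.default_rng(0)
Q = np.zeros((48, 4))  # assumed: 4x12 cliff-walk grid flattened, 4 actions
action = epsilon_greedy(Q, 0, 0.1, rng)
```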

The image below compares the two agents for an epsilon value of 0.1:

[Image: Q-learning vs. SARSA average rewards, epsilon = 0.1]

The image below compares the two agents for an epsilon value of 0.25:

[Image: Q-learning vs. SARSA average rewards, epsilon = 0.25]

The image below compares the two agents for an epsilon value of 0.75:

[Image: Q-learning vs. SARSA average rewards, epsilon = 0.75]
