Udacity Deep Reinforcement Learning course - Value-based methods - P1 Navigation

This repository contains code that train an agent to solve the environment proposed in the Value-based methods section of the Udacity Deep Reinforcement Learning (DRL) course.

Environment

The environment consists of a single agent that has to pick up yellow bananas while avoiding blue bananas. A reward of +1 is provided for collecting yellow bananas, and a reward of -1 for blue ones.

The state space has 37 dimensions that describe the agent's speed and object perception around the agent's forward direction. Given this information, the agent has to select one of 4 discrete actions: move forward, move backward, turn left or turn right.

The task is episodic, and in order to solve the environment, the agent must get an average score of +13 over 100 consecutive episodes.

Getting started

Unity environment

Unity doesn't need to be installed since the environment is already available. It can be downloaded from the following links:

When executing the training script, this path should be referenced with the --env-path argument.

Python dependencies

The project uses Python 3.6 and relies on the Udacity Value Based Methods repository. This repository should be cloned, and the instructions on the README should be followed to install the necessary dependencies.

Instructions

The repository contains 2 scripts under the navigation package: train.py and play.py.

Train

The script train.py can be used to train the agent. It accepts the following arguments:

env-path: path pointing to the Unity Bananas environment
weights-path: path where the agent's NN weights will be stored
episodes: number of episodes the agent should be trained for
time-steps-per-episode: timesteps per episode
eps-start: starting value for epsilon
eps-end: minimum value for epsilon
eps-decay: decay factor for epsilon
gamma: discount rate
learning-rate: agent's NN learning rate
batch-size: size of the agent's experience replay buffer

Example:

python train.py --env-path /home/carlos/cursos/udacity_rl_2023/repos/deep-reinforcement-learning/p1_navigation/Banana_Linux/Banana.x86_64
--weights-path /home/carlos/cursos/udacity_rl_2023/projects/drl_p1_navigation/weights/agent_weights.pth
-- episodes 600

Play

A trained agent can be used to play! To do so, the play.py script can be used, providing the Unity environment and the agent's weights paths:

python play.py --env-path /home/carlos/cursos/udacity_rl_2023/repos/deep-reinforcement-learning/p1_navigation/Banana_Linux/Banana.x86_64
--weights-path /home/carlos/cursos/udacity_rl_2023/projects/drl_p1_navigation/weights/agent_weights.pth

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
ipynb		ipynb
navigation		navigation
weights		weights
.gitignore		.gitignore
README.md		README.md
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ipynb

ipynb

navigation

navigation

weights

weights

.gitignore

.gitignore

README.md

README.md

setup.py

setup.py

Repository files navigation

Udacity Deep Reinforcement Learning course - Value-based methods - P1 Navigation

Environment

Getting started

Unity environment

Python dependencies

Instructions

Train

Play

About

Releases

Packages

Languages

gscharly/drl_p1_navigation

Folders and files

Latest commit

History

Repository files navigation

Udacity Deep Reinforcement Learning course - Value-based methods - P1 Navigation

Environment

Getting started

Unity environment

Python dependencies

Instructions

Train

Play

About

Topics

Resources

Stars

Watchers

Forks

Languages