Rmko4/RL-Catch-Value-Based

Deep Value-Based Reinforcement Learning

An implementation of the following methods: Deep Q-Network (DQN), Double Deep Q-Network (DDQN), Dueling Architecture, Deep Quality-Value (DQV), and DQV-max. The implementation is based on the following papers:
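The methods listed above differ mainly in how Q-values are composed and which bootstrap target each network regresses towards. The following is a minimal sketch in plain Python, based on the usual formulations of the dueling architecture, DQV, and DQV-max in the literature; the function names are hypothetical and the repository's own code may differ in detail.

```python
from typing import List, Sequence, Tuple


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def dqv_target(reward: float, gamma: float,
               next_value_target: float, done: bool) -> float:
    """DQV: both the V and the Q network regress towards r + gamma * V_target(s')."""
    if done:
        return reward
    return reward + gamma * next_value_target


def dqv_max_targets(reward: float, gamma: float,
                    next_q_values_target: Sequence[float],
                    next_value_online: float,
                    done: bool) -> Tuple[float, float]:
    """DQV-max: returns (target for V, target for Q).

    V regresses towards r + gamma * max_a Q_target(s', a),
    Q regresses towards r + gamma * V(s').
    """
    if done:
        return reward, reward
    v_target = reward + gamma * max(next_q_values_target)
    q_target = reward + gamma * next_value_online
    return v_target, q_target
```

In this sketch a terminal transition (`done=True`) drops the bootstrap term, so the target reduces to the immediate reward.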

The methods are trained and evaluated on the Catch game.

Example of trained agent

[Animation: a trained agent playing Catch]

Running the code

Installation

To install all dependencies, run the following command:

pip install -r requirements.txt

Training

To train the agent, run the following command:

python source/train_agent.py [Training Options]

Training Options:

  • --run_name (str): Name of the run.
  • --algorithm ({DQN,Dueling_architecture,DQV,DQV_max}): Type of algorithm to use for training.

  • --log_video: Whether to log a video of the agent's performance.

  • --max_epochs (int): Maximum number of steps to train for.
  • --batch_size (int): Batch size for training.
  • --batches_per_step (int): Number of batches to sample from replay buffer per agent step.
  • --optimizer ({Adam,RMSprop,SGD}): Optimizer to use for training.
  • --learning_rate (float): Learning rate for training.
  • --gamma (float): Discount factor.
  • --epsilon_start (float): Initial epsilon.
  • --epsilon_end (float): Final epsilon.
  • --epsilon_decay_rate (int): Number of steps to decay epsilon over.
  • --buffer_capacity (int): Capacity of replay buffer.
  • --replay_warmup_steps (int): Number of steps to warm up the replay buffer.
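
The three epsilon options describe an exploration schedule. As an illustration only, the sketch below assumes a linear decay from --epsilon_start to --epsilon_end over --epsilon_decay_rate steps, followed by epsilon-greedy action selection; the actual schedule shape used by train_agent.py is not stated here, and the function names are hypothetical.

```python
import random
from typing import Sequence


def epsilon_at(step: int, epsilon_start: float, epsilon_end: float,
               decay_steps: int) -> float:
    """Linearly interpolate epsilon, holding it at epsilon_end after decay_steps."""
    fraction = min(max(step, 0) / decay_steps, 1.0)
    return epsilon_start + fraction * (epsilon_end - epsilon_start)


def select_action(q_values: Sequence[float], epsilon: float,
                  rng: random.Random) -> int:
    """Epsilon-greedy: a random action with probability epsilon, else the argmax."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```

With these assumptions, an agent started at epsilon 1.0 acts fully at random on the first step and becomes progressively greedier until the decay window ends.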

  • --target_net_update_freq (int): Number of steps between target network updates.
  • --soft_update_tau (float): Tau for soft target network updates.
  • --double_q_learning: Whether to use double Q-learning.
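
To make the target-network options concrete, here is a hedged sketch of a soft (Polyak) update and of the standard versus double Q-learning targets. It assumes the common convention target = tau * online + (1 - tau) * target and represents parameters and Q-values as plain lists; the helper names are hypothetical and not taken from the repository.

```python
from typing import List, Sequence


def soft_update(target_params: Sequence[float], online_params: Sequence[float],
                tau: float) -> List[float]:
    """Move each target parameter a fraction tau towards its online counterpart."""
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]


def dqn_target(reward: float, gamma: float,
               next_q_target: Sequence[float], done: bool) -> float:
    """Standard DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward: float, gamma: float,
                      next_q_online: Sequence[float],
                      next_q_target: Sequence[float], done: bool) -> float:
    """Double DQN: the online network selects the action, the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]
```

Setting tau to 1.0 in this sketch copies the online parameters outright, which corresponds to a hard update of the kind performed every --target_net_update_freq steps.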

  • --hidden_size (int): Number of hidden units in the feedforward network.
  • --n_filters (int): Number of filters in the convolutional network.

  • --prioritized_replay: Whether to use prioritized replay.
  • --prioritized_replay_alpha (float): Alpha parameter for prioritized replay.
  • --prioritized_replay_beta (float): Beta parameter for prioritized replay.
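
The alpha and beta options can be read against the usual proportional formulation of prioritized experience replay: alpha controls how strongly priorities skew sampling, and beta controls how strongly importance-sampling weights correct for that skew. The sketch below follows that formulation in plain Python; it is an assumption about how the flags are used, and the function names are hypothetical.

```python
from typing import List, Sequence


def sampling_probabilities(priorities: Sequence[float], alpha: float) -> List[float]:
    """P(i) = p_i^alpha / sum_k p_k^alpha; alpha = 0 gives uniform sampling."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def importance_weights(probabilities: Sequence[float], beta: float) -> List[float]:
    """w_i = (N * P(i))^(-beta), normalised so the largest weight is 1."""
    n = len(probabilities)
    weights = [(n * p) ** (-beta) for p in probabilities]
    max_weight = max(weights)
    return [w / max_weight for w in weights]
```

A batch could then be drawn with `random.choices(range(len(probs)), weights=probs, k=batch_size)` and each sample's loss multiplied by its weight.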

About

Deep Reinforcement Learning: Value-Based methods. An implementation of DQN, DDQN, Dueling Architectures, DQV, DQV-Max on the PyTorch Lightning framework.
