Deep-RL

A repo for my deep RL agents from the course CS698R at IITK.

pip install -e DRLagents

Agents

The agents implemented here are:

NFQ
DQN
Double DQN (DDQN)
Dueling DDQN (D3QN)
Dueling DDQN with Prioritized Experience Replay (D3QN_PER)

from DRLagents import NFQ, DQN, DDQN, D3QN, D3QN_PER

Each agent optionally accepts a function, passed through the stateFn parameter, that builds the observable state from the observation and info returned by env.step(). Heads-up: this function must also handle the case info = None, because env.reset() returns only the observation.
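As a rough illustration of the None handling described above, here is a minimal sketch of such a function. It assumes stateFn is called with the observation and the info dict; the name make_state and the "steps_left" key are hypothetical, and the return type your agent expects should be checked against the class descriptions.

```python
def make_state(observation, info=None):
    """Hypothetical stateFn: observation features plus one extra value from info."""
    features = [float(x) for x in observation]
    # env.reset() returns only the observation, so info may be None here.
    # "steps_left" is an invented key used only for illustration.
    steps_left = 0.0 if info is None else float(info.get("steps_left", 0.0))
    return features + [steps_left]
```

Under these assumptions it would be passed as stateFn=make_state when constructing an agent.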

Other utils

Replay Buffer:
```
from DRLagents.replaybuffer import ReplayBuffer
```
- Implements both the usual uniform experience replay (as in DQN)
- and Prioritized Experience Replay (PER).
- Prioritized mode is enabled by including 'PER' in bufferType, for example:
```
ReplayBuffer(bufferSize, bufferType = 'PER-D3QN', priority_alpha=alpha, priority_beta=beta, priority_beta_rate=beta_rate)
```
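To give a feel for what the alpha and beta parameters usually control in prioritized replay, here is a small sketch of the standard PER formulas. It is not the code inside ReplayBuffer; the function names are hypothetical, and how priority_beta_rate is applied in this package is not shown here.

```python
import random

def per_probabilities(priorities, alpha):
    """Sampling probability P(i) = p_i**alpha / sum_k p_k**alpha."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]

def importance_weights(probs, beta):
    """Importance-sampling weights (N * P(i))**(-beta), scaled so the max is 1."""
    n = len(probs)
    weights = [(n * p) ** (-beta) for p in probs]
    max_w = max(weights)
    return [w / max_w for w in weights]

def sample_indices(probs, batch_size, rng=random):
    """Draw transition indices with replacement according to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=batch_size)
```

With alpha = 0 the sampling is uniform, and with beta = 1 the weights fully correct for the non-uniform sampling.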

Exploration Strategies:

from DRLagents.exploration_strategies import selectEpsilonGreedyAction, selectGreedyAction, selectSoftMaxAction

Greedy Exploration
Epsilon-Greedy Exploration
Softmax Exploration
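The three strategies listed above can be sketched on a plain list of Q-values as follows. This only illustrates the general techniques; the function names and signatures are hypothetical and may differ from those of selectGreedyAction, selectEpsilonGreedyAction and selectSoftMaxAction.

```python
import math
import random

def greedy(q_values):
    """Pick the action with the highest Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])

def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy(q_values)

def softmax_probs(q_values, temperature):
    """Boltzmann distribution over actions; the max is subtracted for stability."""
    m = max(q_values)
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_action(q_values, temperature, rng=random):
    """Sample an action from the softmax distribution."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

A lower temperature makes the softmax choice closer to greedy, while a higher one makes it closer to uniform.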

Decay Wrapper:
```
from DRLagents.exploration_strategies import decayWrapper
```
- Decays the epsilon parameter (for the epsilon-greedy strategy)
- or the temperature parameter (for the softmax strategy).
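The idea of wrapping a strategy so that its parameter shrinks over time can be sketched as below. This is an assumption-laden illustration, not decayWrapper itself: the name decay_wrapper_sketch, the linear schedule, and the per-call step counter are all hypothetical choices.

```python
def decay_wrapper_sketch(strategy, start, end, decay_steps):
    """Return a strategy whose parameter moves linearly from start to end.

    strategy is any callable taking (q_values, parameter), where the parameter
    is epsilon or the temperature. The schedule here is an assumed one.
    """
    step = 0

    def wrapped(q_values):
        nonlocal step
        frac = min(step / decay_steps, 1.0)   # progress in [0, 1]
        value = start + frac * (end - start)  # current epsilon or temperature
        step += 1
        return strategy(q_values, value)

    return wrapped
```

Each call to the wrapped strategy advances the schedule by one step and holds the final value once decay_steps is reached.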

Examples:

The following snippet shows how to train a deep network (a torch nn.Module) with this package. For the full code, see DQNexample.py.

import gym
...
from DRLagents import DQN
from DRLagents.exploration_strategies import decayWrapper, selectEpsilonGreedyAction, selectGreedyAction

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

env = gym.make('CartPole-v0')
explorationStrategyTrain = decayWrapper(selectEpsilonGreedyAction, 0.5, 0.05, 500, device=device)

DQNagent = DQN(Qnetwork, env, seed=0, gamma=0.8, epochs=10, bufferSize=10000, batchSize=512,
                optimizerFn=optim.Adam, optimizerLR=0.001, MAX_TRAIN_EPISODES=800, MAX_EVAL_EPISODES=1,
                explorationStrategyTrainFn=explorationStrategyTrain, explorationStrategyEvalFn=selectGreedyAction,
                updateFrequency=5, device=device)
                
train_stats = DQNagent.trainAgent() # train the agent
eval_rewards = DQNagent.evaluateAgent()

The repository contains one example script per agent type, named after the agent followed by example.py: NFQexample.py, DQNexample.py, DDQNexample.py, D3QNexample.py and D3QN_PERexample.py.

To learn more about the inputs, please read the class descriptions. Fuller documentation will be added to the README in a while...
