GitHub - mohitpandey92/k_arm_bandit: A simple exercise in reinforcement learning

Here we simulate the k-armed bandit problem, one of the basic problems in reinforcement learning. It is a simple exercise that illustrates the exploration-exploitation tradeoff. We trained our agent with a greedy policy (which always picks the action with the highest estimated immediate reward) and an epsilon-greedy policy (which mostly does the same but occasionally takes a risk and picks a random action in order to explore).

This particular exercise was taken from the textbook "Reinforcement Learning: An Introduction" by Sutton and Barto.
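
The difference between the two policies can be sketched in a few lines of Python. This is a minimal illustration, not the code from `k_arm_bandit.ipynb`: the function name `run_bandit`, its parameters, and the choice of normally distributed arm values and rewards (as in the textbook's testbed) are all assumptions made for the example. Setting `epsilon=0` gives the greedy agent; a small positive `epsilon` gives the epsilon-greedy agent.

```python
import random


def run_bandit(k=10, steps=1000, epsilon=0.1, seed=0):
    """Play one k-armed bandit task (hypothetical helper).

    Returns (average reward per step, fraction of steps on the best arm).
    """
    rng = random.Random(seed)
    # Hidden true value of each arm. Assumption: drawn from a standard normal.
    true_values = [rng.gauss(0.0, 1.0) for _ in range(k)]
    best_arm = max(range(k), key=lambda a: true_values[a])

    estimates = [0.0] * k  # sample-average estimate of each arm's value
    counts = [0] * k       # number of times each arm has been pulled
    total_reward = 0.0
    optimal_pulls = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick any arm uniformly at random.
            arm = rng.randrange(k)
        else:
            # Exploit: pick the arm with the highest estimate, ties broken randomly.
            top = max(estimates)
            arm = rng.choice([a for a in range(k) if estimates[a] == top])

        # Assumption: noisy reward centred on the arm's true value.
        reward = rng.gauss(true_values[arm], 1.0)

        # Incremental update of the sample average for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

        total_reward += reward
        optimal_pulls += arm == best_arm

    return total_reward / steps, optimal_pulls / steps


if __name__ == "__main__":
    for eps in (0.0, 0.1):
        avg, frac = run_bandit(epsilon=eps, seed=42)
        print(f"epsilon={eps}: average reward {avg:.3f}, optimal-arm fraction {frac:.3f}")
```

A single run is noisy, so a fair comparison between the two policies would average the results over many independently seeded tasks.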

Repository contents (13 commits):

RL_ppt
papers
RL ppt.pdf
ReadMe.md
k_arm_bandit.ipynb
