Maintains an environmental exploration map and updates it using Bayesian probability, **for autonomous vehicles**
Updated Apr 24, 2018 - C++
A PyTorch implementation of the ICML 2018 paper "Self-Imitation Learning".
Active versus Passive exploration
Research Thesis - Reinforcement Learning
Personalized and Interactive Music Recommendation with Bandit approach
Action elimination for multi-armed bandits
Classic papers and resources on recommendation
Over-parameterization = exploration?
OpenAI Gym environment implementation
A comparison of two methods for handling the exploration-exploitation dilemma in multi-armed bandits
Focuses on reinforcement learning concepts, use cases, and learning approaches
OSPO is a novel metaheuristic algorithm which has the potential to solve different kinds of problems with promising performance.
An implementation of the reinforcement learning multi-armed bandit experiment using different exploration techniques.
A companion repository for 'Inverse Bayesian Optimization: Learning Human Acquisition Functions in an Exploration vs Exploitation Search Task'
This project compares different reinforcement learning algorithms, including Monte Carlo, Q-learning, lambda Q-learning epsilon-greedy variations, etc.
A short implementation of bandit algorithms - ETC, UCB, MOSS and KL-UCB
A simple exercise in reinforcement learning