A multi-armed bandit (MAB) simulation library in Python
Updated May 22, 2024 - Python
Problem Statement: Perform clustering (hierarchical, K-means, and DBSCAN) on the airlines data to obtain the optimum number of clusters. Content: This data set contains statistics, in arrests per 100,000 residents, for assault, murder, and rape in each of the 50 US states in 1973. Also given is the percentage of the population living in urban areas.
Deep Recurrent Q-Network with different exploration strategies for self-driving cars (using AirSim)
Parameter optimization of a reinforcement learning deep Q-network with a memory replay buffer, using a genetic algorithm, in the Snake game. Base code for the Snake environment from codecamp.
Playing with Reinforcement Learning
This repository contains Reinforcement Learning course projects...
A content-based music recommendation system that suggests playlists made from locally stored songs and updates its suggestions based on user feedback using non-stationary Bayesian reinforcement learning. Created using React and the Electron.js framework.
Jupyter notebooks implementing Reinforcement Learning algorithms in NumPy and TensorFlow
A set of tools for machine learning (currently, there are active learning utilities and implementations of some stacking-based techniques).
A Python-based platformer infused with Q-Learning and dynamic level creation from simple JSON files.
Reinforcement Learning (COMP 579) Project
Epsilon-Greedy Q-Learning in a Multi-agent Environment
An implementation of the Epsilon Greedy and Thompson Sampling algorithms using NumPy, pandas and Matplotlib.
A multi-agent reinforcement learning environment where two agents controlled by DRQNs play a custom version of the pursuit-evasion game.
Implementation of the Q-learning and SARSA algorithms to solve the CartPole-v1 environment. [Advance Machine Learning project - UniGe]
An epsilon-greedy Dueling Deep Q-Network based on Prioritised Experience Replay to compute the minimal time path for traversing a maze.
Reinforcement Learning and Deep Reinforcement Learning
The simulation of Epsilon-Greedy and Thompson Sampling algorithms for Bayesian A/B Testing. The project shows how both algorithms find the optimal bandit and approximate the rewards of each bandit, given the true reward. Visualizations are done to demonstrate the learning process and convergence.
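For readers new to the topic, the epsilon-greedy rule that these repositories share can be sketched in a few lines. This is a minimal illustration, not code from any repository listed here: the function name `run_epsilon_greedy`, the Bernoulli reward model, and the example probabilities are all assumptions chosen for the sketch.

```python
import random


def run_epsilon_greedy(true_probs, epsilon=0.1, steps=5000, seed=0):
    """Simulate epsilon-greedy on Bernoulli bandits (hypothetical helper).

    true_probs: assumed success probability of each arm.
    Returns (pull counts per arm, estimated mean reward per arm).
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    counts = [0] * n_arms        # how many times each arm was pulled
    estimates = [0.0] * n_arms   # running mean reward of each arm

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Draw a 0/1 reward from the chosen arm.
        reward = 1.0 if rng.random() < true_probs[arm] else 0.0

        # Incremental update of the sample mean for that arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return counts, estimates


if __name__ == "__main__":
    counts, estimates = run_epsilon_greedy([0.2, 0.5, 0.75])
    print("pulls per arm:", counts)
    print("estimated rewards:", [round(e, 3) for e in estimates])
```

With a fixed `epsilon`, every arm keeps being sampled occasionally, so the estimates of all arms continue to improve while most pulls go to the arm that currently looks best.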