# epsilon-greedy

Here are 77 public repositories matching this topic...

mabby

thetawom / mabby

A multi-armed bandit (MAB) simulation library in Python

python reinforcement-learning simulation probability artificial-intelligence thompson-sampling epsilon-greedy multi-armed-bandits agent-based-simulation

Updated May 22, 2024
Python

ShaikRiyazSandy / Clustering

Problem statement: Perform clustering (hierarchical, K-means, and DBSCAN) on the airlines data to obtain the optimum number of clusters. Content: This data set contains statistics, in arrests per 100,000 residents, for assault, murder, and rape in each of the 50 US states in 1973. Also given is the percent of the population living in urban areas.

data-science epsilon-greedy clustering-algorithm kmeans-clustering dbscan-clustering heirarchical-clustering

Updated Apr 20, 2024
Jupyter Notebook

ValentinaZangirolami / DRL

Deep Recurrent Q-Network with different exploration strategies for self-driving cars (using AirSim)

reinforcement-learning deep-learning tensorflow deep-reinforcement-learning epsilon-greedy self-driving-car softmax airsim deep-recurrent-q-network drqn exploration-strategy softmax-exploration max-boltzmann-exploration

Updated Mar 26, 2024
Python

roaked / snake-evolutionary-reinforcement-learning

Parameter optimization of a reinforcement learning deep Q-network with a memory replay buffer, using a genetic algorithm in the snake game. Base code for the snake env from codecamp.

deep-neural-networks reinforcement-learning optimization genetic-algorithm deep-reinforcement-learning neuroevolution snake-game epsilon-greedy evolutionary-algorithm evolutionary-strategy stochastic-optimization fitness-function snake-ai memory-replay optimistic-exploration

Updated Mar 1, 2024
Python

supersjgk / Reinforcement_Learning

Playing with Reinforcement Learning

reinforcement-learning openai-gym q-learning python3 gym epsilon-greedy ppo

Updated Jan 5, 2024
Jupyter Notebook

Taabannn / intro-rl

This repository contains Reinforcement Learning course projects...

reinforcement-learning jupyter-notebook python3 statistical-inference epsilon-greedy greedy-approach ucb-algorithm gradient-based-bandit

Updated Nov 6, 2023
Jupyter Notebook

cyberquill / Riyaaz

A content-based music recommendation system that suggests playlists made from locally stored songs and updates its suggestions based on user feedback using non-stationary Bayesian reinforcement learning. Created using React and the Electron.js framework.

electron react data-science reinforcement-learning clustering jupyter-notebook music-recommendation artificial-intelligence epsilon-greedy librosa

Updated Oct 5, 2023
Jupyter Notebook

bmarroc / reinforcement-learning

Jupyter notebooks implementing Reinforcement Learning algorithms in Numpy and Tensorflow

monte-carlo q-learning epsilon-greedy policy-gradient sarsa dynamic-programming tdl policy-evaluation markov-decision-processes policy-iteration function-approximation bellman-equation policy-improvement

Updated Sep 1, 2023
Jupyter Notebook

Nikolay-Lysenko / dsawl

A set of tools for machine learning (currently, there are active learning utilities and implementations of some stacking-based techniques).

epsilon-greedy active-learning stacking categorical-features out-of-fold target-encoding

Updated Aug 27, 2023
Python

StepanTita / q-learning

A Python-based platformer infused with Q-learning and dynamic level creation from simple JSON files.

python machine-learning reinforcement-learning machine-learning-algorithms q-learning epsilon-greedy reinforcement-learning-algorithms game-ai reinforcement-learning-playground reinforcement-learning-environments q-learning-algorithm platformer-game

Updated Aug 11, 2023
Python

Sagarnandeshwar / Bandit_Algorithms

Reinforcement Learning (COMP 579) Project

reinforcement-learning thompson-sampling epsilon-greedy ucb bernoulli-distribution bandit-algorithms exploration-exploitation

Updated Aug 4, 2023
Jupyter Notebook

hritikb / Reinforcement-Learning-Algorithms

reinforcement-learning q-learning grid-world epsilon-greedy sarsa dynamic-programming multi-armed-bandits policy-iteration value-iteration monte-carlo-methods temporal-differencing-learning upper-confidence-bound gradient-bandit optimistic-inital-values greedy-policy

Updated Jun 29, 2023
Jupyter Notebook

DimitrisPatiniotis / epsilon-greedy-Q-learning

Epsilon-Greedy Q-Learning in a Multi-agent Environment

reinforcement-learning q-learning epsilon-greedy cooperative-environments

Updated Jun 24, 2023
Python

marmiskarian / AB-testing

An implementation of the Epsilon Greedy and Thompson Sampling algorithms using NumPy, pandas and Matplotlib.

thompson-sampling epsilon-greedy ab-testing marketing-analytics

Updated Jun 21, 2023
Jupyter Notebook

1391819 / MA-seek

A multi-agent reinforcement learning environment where two agents controlled by DRQNs play a custom version of the pursuit-evasion game.

tensorflow epsilon-greedy pomdp drqn experience-replay marl

Updated Jun 16, 2023
Python

ErfanFathi / RL_Cartpole

Implementation of the Q-learning and SARSA algorithms to solve the CartPole-v1 environment. [Advance Machine Learning project - UniGe]

reinforcement-learning q-learning python3 epsilon-greedy sarsa cartpole-v1 q-learning-vs-sarsa

Updated Jun 9, 2023
Python

Resh-97 / Dynamic_Maze_Solving

An epsilon-greedy Dueling Deep Q-Network Based on Prioritised Experience Replay to compute the minimal time path for traversing a maze.

reinforcement-learning q-learning epsilon-greedy dueling-dqn minimal-path dynamic-maze comp6247

Updated Jun 7, 2023
Python

Wb-az / Reinforcement-Learning

Reinforcement Learning and Deep Reinforcement Learning

python deep-learning deep-reinforcement-learning epsilon-greedy pytorch-implementation soft-actor-critic q-learning-algorithm custom-environment policy-iteration-algorithm bipedalwalker-v3 lunarlandercontinuous-v2 learning-policies soft-actor-critic-continuous

Updated Jun 5, 2023
Jupyter Notebook

HrayrMuradyan / A-B-Testing

The simulation of Epsilon-Greedy and Thompson Sampling algorithms for Bayesian A/B Testing. The project shows how both algorithms find the optimal bandit and approximate the rewards of each bandit, given the true reward. Visualizations are done to demonstrate the learning process and convergence.

thompson-sampling epsilon-greedy ab-testing abtesting a-b-testing epsilon-greedy-exploration bayesian-ab-testing

Updated Apr 13, 2023
Jupyter Notebook

ArianQazvini / Ai-Reinforcement_Learning

reinforcement-learning q-learning epsilon-greedy value-iteration temporal-differencing-learning

Updated Feb 7, 2023
Python
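
For orientation, the strategy shared by the repositories above can be sketched in a few lines of Python. This is a minimal illustration and is not taken from any listed repository; the function name `epsilon_greedy_bandit`, its parameters, and the Bernoulli reward model are all assumptions chosen for the example. With probability `epsilon` the agent explores a random arm, otherwise it exploits the arm with the highest estimated mean reward.

```python
import random


def epsilon_greedy_bandit(true_means, epsilon=0.1, steps=5000, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit (hypothetical helper).

    true_means: success probability of each arm (assumed reward model).
    Returns (estimates, counts): estimated mean reward and pulls per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate
            # (ties resolve to the lowest index).
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Bernoulli reward: 1 with probability true_means[arm], else 0.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the sample mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = epsilon_greedy_bandit([0.2, 0.5, 0.8])
    print("estimated means:", [round(e, 3) for e in estimates])
    print("pull counts:", counts)
```

A fixed `epsilon` keeps exploring forever; several of the listed projects compare this against alternatives such as Thompson sampling or UCB, which are not shown here.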
