AgileRL v0.1.21 introduces contextual multi-armed bandit algorithms to the framework. Train agents to solve complex optimisation problems with our two new evolvable bandit algorithms!
This release includes the following updates:
- Two new evolvable contextual bandit algorithms: Neural Contextual Bandits with UCB-based Exploration and Neural Thompson Sampling
- A new contextual bandits training function for fast, straightforward training of bandit agents
- A new BanditEnv class for converting any labelled dataset into a bandit learning environment
- Tutorials on using AgileRL bandit algorithms with evolvable hyperparameter optimisation for SOTA results
- New demo and benchmarking scripts for bandit algorithms
- And more!
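Conceptually, a class like BanditEnv turns supervised data into a bandit problem: each step serves one example's feature vector as the context, the agent pulls an arm (a class label), and the reward is 1 if the arm matches the true label. The sketch below illustrates that idea only; the class name and interface are hypothetical, not AgileRL's actual API.

```python
import random


class LabelledDatasetBanditEnv:
    """Illustrative sketch: wrap a labelled dataset (X, y) as a contextual bandit.

    Hypothetical interface, not AgileRL's actual BanditEnv API.
    """

    def __init__(self, features, labels, seed=0):
        self.features = features
        self.labels = labels
        self.arms = sorted(set(labels))  # one arm per class label
        self.rng = random.Random(seed)
        self._idx = None

    def reset(self):
        # Sample a random example; its feature vector is the context
        self._idx = self.rng.randrange(len(self.features))
        return self.features[self._idx]

    def step(self, arm):
        # Reward 1.0 for choosing the correct label, 0.0 otherwise,
        # then serve a fresh context for the next round
        reward = 1.0 if self.arms[arm] == self.labels[self._idx] else 0.0
        next_context = self.reset()
        return next_context, reward


# Tiny toy dataset: context [x] is labelled 'pos' if x > 0, else 'neg'
X = [[-1.0], [2.0], [0.5], [-3.0]]
y = ["neg", "pos", "pos", "neg"]
env = LabelledDatasetBanditEnv(X, y)
context = env.reset()
context, reward = env.step(0)  # pull arm 0 ('neg')
```

A bandit agent such as NeuralUCB or Neural Thompson Sampling would then loop over `reset`/`step`, using the context to choose an arm and the reward to update its value network.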
More updates will be coming soon!