This repository provides implementations of a Q-learning agent to balance a pole on a cart. The agent is implemented in two different environments:
- The default CartPole-v1 environment provided by OpenAI's gym.
- A custom CustomCartPole-v1 environment where external disturbances (like wind) are introduced and custom reward functions are used.
- Python 3.x
- numpy
- gym
- (optional) Pygame, if using Gym's 'human' rendering mode.
To install the required libraries, you can use:
pip install -r requirements.txt
|-- run.py                      # Main execution script
|-- src/
|   |-- app/
|   |   |-- custom_env.py       # Custom environment definition
|   |   `-- qlearning_agent.py  # Q-learning agent definition
|   |-- modules/
|   |   |-- testing.py          # Agent testing procedures
|   |   `-- training.py         # Agent training procedures
`-- log/
    `-- training.log            # Log file for debugging during training
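To run the agent in the default CartPole-v1 environment:
python run.py
To run the agent in the custom CustomCartPole-v1 environment:
python run.py --custom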
The Q-learning agent is implemented with functionalities to:
- Choose an action based on an ε-greedy strategy.
- Learn by updating the Q-table based on the Bellman equation.
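The two behaviours listed above can be sketched as follows. This is a minimal illustration that uses only the standard library, not the code in qlearning_agent.py: the class layout, the method names choose_action and learn, and the default values of alpha, gamma and epsilon are all assumptions.

```python
import random
from collections import defaultdict


class QLearningAgent:
    """Tabular Q-learning with an epsilon-greedy policy (illustrative sketch)."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.99, epsilon=0.1, seed=None):
        self.n_actions = n_actions
        self.alpha = alpha      # learning rate (assumed default)
        self.gamma = gamma      # discount factor (assumed default)
        self.epsilon = epsilon  # exploration probability (assumed default)
        self.rng = random.Random(seed)
        # Q-table: maps a discrete state to one value per action, all starting at 0.
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def choose_action(self, state):
        # Explore with probability epsilon, otherwise exploit the best known action.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        values = self.q[state]
        return values.index(max(values))

    def learn(self, state, action, reward, next_state, done):
        # Bellman target: r + gamma * max_a' Q(s', a'); no bootstrap at terminal states.
        best_next = 0.0 if done else max(self.q[next_state])
        target = reward + self.gamma * best_next
        self.q[state][action] += self.alpha * (target - self.q[state][action])
```

States must be hashable (for example a tuple of bin indices) so they can serve as Q-table keys.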
Given that the state space of the CartPole environment is continuous, the states are discretized to fit into a Q-table.
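One common way to discretize is to clip each observation component to a fixed range and split that range into equal-width bins. The sketch below shows the idea; the function name, the ranges and the bin counts are assumptions chosen for illustration, and the repository may use different values.

```python
def discretize(observation, lows, highs, bins):
    """Map a continuous observation to a tuple of bin indices."""
    state = []
    for value, low, high, n in zip(observation, lows, highs, bins):
        clipped = min(max(value, low), high)         # keep the value inside [low, high]
        fraction = (clipped - low) / (high - low)    # position within the range, 0..1
        state.append(min(int(fraction * n), n - 1))  # the top edge falls in the last bin
    return tuple(state)


# Assumed ranges for (cart position, cart velocity, pole angle, pole angular velocity).
LOWS = (-2.4, -3.0, -0.21, -3.5)
HIGHS = (2.4, 3.0, 0.21, 3.5)
BINS = (6, 6, 12, 12)
```

The velocity components are unbounded in CartPole, which is why they are clipped to a finite range before binning.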
During training:
- The agent learns by interacting with the environment.
- The Q-values are updated based on the agent's interactions.
- Logging is done to monitor ε-greedy choice and Q-table updates.
After training, the Q-learning agent's performance is evaluated on several episodes. The agent uses the learned Q-values to select actions, and the average reward over the test trials is printed.
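A rough sketch of such an evaluation loop is shown below. It is not the code in testing.py: the function name, its parameters and the dictionary-based Q-table are assumptions. It also assumes the five-value step() and two-value reset() return signatures of gym 0.26 and later; older gym releases return four values from step() and only the observation from reset().

```python
def evaluate(env, q_table, to_state, n_actions, episodes=10):
    """Run greedy episodes and return the average total reward."""
    totals = []
    for _ in range(episodes):
        observation, _info = env.reset()
        done, total = False, 0.0
        while not done:
            # Unseen states fall back to all-zero values, so action 0 is chosen.
            values = q_table.get(to_state(observation), [0.0] * n_actions)
            action = values.index(max(values))  # greedy: no exploration at test time
            observation, reward, terminated, truncated, _info = env.step(action)
            total += reward
            done = terminated or truncated
        totals.append(total)
    return sum(totals) / len(totals)
```

Here to_state is whatever function turns a raw observation into a Q-table key, such as the discretization step used during training.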
The custom environment introduces a few modifications:
- External disturbances, simulating wind, which can push the cart.
- A customized reward function which penalizes the agent based on the cart's position and the pendulum's angle.
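To make the two modifications concrete, here is one possible form they could take. The functions, the gust probability, the force magnitude and the equal weighting of the two penalties are all hypothetical and are not taken from custom_env.py. The position and angle limits are the termination thresholds of the standard CartPole environment.

```python
import math
import random

X_LIMIT = 2.4                     # cart position limit used by CartPole
THETA_LIMIT = 12 * math.pi / 180  # pole angle limit used by CartPole, in radians


def wind_force(rng, max_force=2.0, gust_probability=0.1):
    """Return a random horizontal push on the cart, or 0.0 when there is no gust."""
    if rng.random() < gust_probability:
        return rng.uniform(-max_force, max_force)
    return 0.0


def shaped_reward(x, theta):
    """Reward of 1 at the upright centre, shrinking linearly toward the limits."""
    position_penalty = min(abs(x) / X_LIMIT, 1.0)
    angle_penalty = min(abs(theta) / THETA_LIMIT, 1.0)
    return 1.0 - 0.5 * position_penalty - 0.5 * angle_penalty
```

In a custom environment, a force like this would typically be added to the agent's push inside step(), and the shaped reward would replace the constant reward of 1 per step.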
Both implementations feature a debug option during training. If set to debug=True, the log records whether each action was chosen by exploration or exploitation, along with any changes to the Q-table. The log can be found at log/training.log.
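A debug log of this kind could be set up with Python's standard logging module, roughly as sketched below. The helper names make_training_logger and log_step, the logger name and the message format are assumptions; only the debug flag and the log/training.log path come from this README.

```python
import logging
import os


def make_training_logger(path="log/training.log", debug=False):
    """Return a logger that writes DEBUG records to `path` only when debug is True."""
    logger = logging.getLogger("training")
    # Drop handlers left over from an earlier call so records are not duplicated.
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
        handler.close()
    logger.propagate = False
    logger.setLevel(logging.DEBUG if debug else logging.WARNING)
    if debug:
        os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
        handler = logging.FileHandler(path, mode="w")
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger


def log_step(logger, explored, state, action, old_q, new_q):
    """Record how the action was chosen and how the Q-value changed."""
    mode = "exploration" if explored else "exploitation"
    logger.debug("%s: state=%s action=%s Q %.4f -> %.4f", mode, state, action, old_q, new_q)
```

With debug=False the logger has no file handler and discards debug records, so the same log_step calls can stay in the training loop at little cost.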
This project was developed with the support of the OpenAI platform and draws on tutorials and resources available for the OpenAI Gym library.