This is a Java program that aims to give reinforcement-learning beginners a better understanding of Q-learning.
The goal is to train the agent to find a path to the destination:
- The red square represents the agent.
- Black squares represent rivers; the agent receives a -100 punishment for falling into a river.
- Grey squares represent walls; the agent cannot pass through them.
- The blue square represents the goal destination; the agent receives a +300 reward on arrival.
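The reward scheme above can be sketched as a simple lookup. The `CellType` enum and `rewardOf` method below are illustrative assumptions, not names from the project:

```java
// Illustrative sketch of the reward scheme described above;
// the enum and method names are assumptions, not project code.
enum CellType { EMPTY, WALL, RIVER, GOAL }

class Rewards {
    static int rewardOf(CellType cell) {
        switch (cell) {
            case RIVER: return -100; // falling into a river
            case GOAL:  return 300;  // reaching the blue destination
            default:    return 0;    // ordinary step (walls are impassable)
        }
    }
}
```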
- Option: Show Q tables — visualize the four action values (up, down, left, right) in each square.
- Option: Show routes — see the path the agent has learned from its accumulated training.
- Clone the repo locally and import it into Eclipse.
- Manually add the external jar (/jars/designgridlayout-1.11.jar) to the project.
- Run GameRunner.java as a Java application.
In the Control Panel, you can adjust the parameters gamma, alpha, and epsilon in the update formula to see how they affect the algorithm's performance. You can also set the time interval (in milliseconds) between two frames and the maximum number of training iterations, and replay the training result with the Demo button.
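Epsilon controls how often the agent explores instead of exploiting its current Q-values. A minimal epsilon-greedy selection sketch is shown below; the four-action layout follows the README, but the class and method names are illustrative assumptions, not project code:

```java
import java.util.Random;

// Minimal epsilon-greedy action selection sketch (names are assumptions).
// qRow holds the four action values (up, down, left, right) of one square.
class EpsilonGreedy {
    static final Random RNG = new Random();

    static int chooseAction(double[] qRow, double epsilon) {
        if (RNG.nextDouble() < epsilon) {
            return RNG.nextInt(qRow.length); // explore: pick a random action
        }
        int best = 0;                        // exploit: pick the argmax action
        for (int a = 1; a < qRow.length; a++) {
            if (qRow[a] > qRow[best]) best = a;
        }
        return best;
    }
}
```

With epsilon = 0 the agent always exploits; with epsilon = 1 it always explores.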
// One training episode per outer-loop iteration
while (steps <= iteration && !reset) {
    current_s = agent.randStart();                      // start from a random state
    // run until the agent falls into the cliff or reaches the terminal
    while (!agent.arriveIn(cliff.location, current_s)
            && !agent.arriveIn(terminal.location, current_s)) {
        action   = agent.greedyAction(current_s);       // epsilon-greedy action selection
        reward   = agent.getReward(current_s, action);
        newstate = agent.getNextState(current_s, action);
        max_Q    = agent.getMaxQvalue(newstate);        // max_a' Q(s', a')
        this_Q   = agent.getQvalue(current_s, action);  // Q(s, a)
        // Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        new_Q = this_Q + alpha * (reward + gamma * max_Q - this_Q);
        agent.updateQtable(current_s, action, new_Q);
        current_s = newstate;
        graphUpdate(current_s, steps);
        Thread.sleep(time_interval);                    // frame delay set in the Control Panel
    }
    // redraw once more: the last state is either the cliff or the terminal
    graphUpdate(current_s, steps);
    steps++;
}
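The loop body above applies the standard tabular Q-learning rule, Q(s,a) ← Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)). The self-contained sketch below runs one such update on made-up numbers to show the arithmetic; it mirrors the loop body but is not project code:

```java
// One Q-learning update, mirroring the loop body above.
// All numeric values here are illustrative, not from the project.
class QUpdateDemo {
    static double update(double thisQ, double reward, double maxNextQ,
                         double alpha, double gamma) {
        return thisQ + alpha * (reward + gamma * maxNextQ - thisQ);
    }

    public static void main(String[] args) {
        // thisQ = 0.0, reward = -1.0, maxNextQ = 2.0, alpha = 0.5, gamma = 0.9
        // new Q = 0.0 + 0.5 * (-1.0 + 0.9 * 2.0 - 0.0) = 0.4
        System.out.println(update(0.0, -1.0, 2.0, 0.5, 0.9));
    }
}
```

Higher alpha makes each update move Q further toward the new estimate; higher gamma weights future rewards more heavily.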