INF581 Project - Reinforcement Learning Volleyball 2v2 competitive set-up

Ultimate Volleyball

Baseline (purple) vs RL (blue)

This project is based on the Ultimate Volleyball environment built on Unity ML-Agents by Joy Zhang. The original code can be found here. In this environment, two agents play volleyball over a net. Our project has two goals:

  1. Improve the existing environment using reward engineering to increase training speed.
  2. Develop and compare several methods to implement 2v2 volleyball games.

Please read our report for more information on the project. This branch contains the code for the 2v2 case. For the 1v1 case, please go to this branch.

Contents

  1. Getting started
  2. Training
  3. Environment description
  4. Teams
  5. Demo

Getting Started

  1. Install the Unity ML-Agents toolkit (Release 19+) by following the installation instructions.
  2. Download or clone this repo containing the volleyRL Unity project.
  3. Open the volleyRL project in Unity (Unity Hub → Projects → Add → Select root folder for this repo).
  4. Load the VolleyballMain scene (Project panel → Assets → Scenes → VolleyballMain.unity).
  5. Click the ▶ button at the top of the window. This will run the agent in inference mode using the provided baseline model.
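Optionally, before moving on to training, you can check from Python that the toolkit installed in step 1 is importable (a minimal sanity check, assuming a standard pip install of Release 19):

```python
# Confirm the ML-Agents Python packages are on the path.
# The version string is defined by the mlagents_envs package itself.
import mlagents_envs
print(mlagents_envs.__version__)
```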

Training

  1. If you previously changed Behavior Type to Heuristic Only, ensure that the Behavior Type is set back to Default (see Heuristic Mode).
  2. Activate the virtual environment containing your installation of ml-agents.
  3. Make a copy of the provided training config file in a convenient working directory.
  4. From the command line, run: mlagents-learn <path to config file> --run-id=<some_id> --time-scale=1
    • Replace <path to config file> with the path to the copy you made in Step 3
  5. When you see the message "Start training by pressing the Play button in the Unity Editor", click ▶ within the Unity GUI.
  6. From another terminal window, navigate to the same directory you ran Step 4 from, and run tensorboard --logdir results to observe the training process.

For more detailed instructions, check the ML-Agents getting started guide.
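If you want to verify the environment before launching a full run, the ML-Agents low-level Python API can connect to the editor the same way the trainer does. The sketch below (assuming the mlagents_envs package from Release 19; the behavior names are whatever the scene defines) connects, prints each behavior's observation shapes and discrete action branches, then disconnects:

```python
from mlagents_envs.environment import UnityEnvironment

# file_name=None connects to a Unity Editor in Play mode instead of
# launching a built executable; press ▶ in the editor when prompted.
env = UnityEnvironment(file_name=None)
env.reset()
for name, spec in env.behavior_specs.items():
    obs_shapes = [obs.shape for obs in spec.observation_specs]
    print(name, obs_shapes, spec.action_spec.discrete_branches)
env.close()
```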

Environment Description (2v2 set-up)

Goal: Get the ball to bounce on the opponent's side of the court while preventing it from bouncing on your own side.

Action space:

4 discrete action branches:

  • Forward motion (3 possible actions: forward, backward, no action)
  • Side motion (3 possible actions: left, right, no action)
  • Jump (2 possible actions: jump, no action)
  • Touch (2 possible actions: touch, no action)

The touch action performs a spike if the agent has jumped, or a set if the agent is on the ground.
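
As an illustration of how this action space looks through the ML-Agents Python API (a sketch only; the branch ordering follows the list above, and the single-agent batch is hypothetical):

```python
import numpy as np
from mlagents_envs.base_env import ActionTuple

# Branch sizes from the list above:
# forward motion (3), side motion (3), jump (2), touch (2).
BRANCH_SIZES = (3, 3, 2, 2)

rng = np.random.default_rng(0)
# One row per agent, one column per branch; here a single agent.
discrete = np.stack(
    [rng.integers(0, n, size=1) for n in BRANCH_SIZES], axis=1
).astype(np.int32)

action = ActionTuple(discrete=discrete)
# env.set_actions(behavior_name, action) would queue it for the next step.
```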

Observation space:

Total size: 15

  • Normalised directional vector from agent to ball (3)
  • Distance from agent to ball (1)
  • Normalised directional vector from agent to teammate (3)
  • Distance from agent to teammate (1)
  • Ball X, Y, Z velocity (3)
  • Agent X, Y, Z velocity (3)
  • Last player to touch the ball (1)
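
To make the layout concrete, here is a hypothetical reconstruction of that vector in numpy (all function and argument names are made up for illustration; the actual encoding, including normalisation details, lives in the Unity agent script):

```python
import numpy as np

def build_observation(agent_pos, agent_vel, ball_pos, ball_vel,
                      mate_pos, last_toucher):
    # Sketch of the 15-dim layout only; not the project's C# code.
    to_ball = ball_pos - agent_pos
    to_mate = mate_pos - agent_pos
    ball_dist = np.linalg.norm(to_ball)
    mate_dist = np.linalg.norm(to_mate)
    return np.concatenate([
        to_ball / ball_dist,   # (3) unit vector agent -> ball
        [ball_dist],           # (1) distance to ball
        to_mate / mate_dist,   # (3) unit vector agent -> teammate
        [mate_dist],           # (1) distance to teammate
        ball_vel,              # (3) ball X, Y, Z velocity
        agent_vel,             # (3) agent X, Y, Z velocity
        [last_toucher],        # (1) last player to touch the ball
    ])                         # total: 15
```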

Reward function:

The project contains several examples of how the reward function can be defined. The base example gives a +1 reward each time the agent hits the ball over the net. In line with our first objective, we developed a more complex reward function to increase training speed; see our report for details.
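
For concreteness, the base example can be written as follows. The jump-penalty term mirrors the with/without-jump-penalty comparison in the Teams section below, but its magnitude here is a hypothetical placeholder, and the full shaped reward is described in the report, not reproduced here:

```python
def sketch_reward(hit_over_net: bool, jumped: bool,
                  jump_penalty: float = 0.05) -> float:
    # Base example from the project: +1 whenever the agent sends
    # the ball over the net.
    reward = 1.0 if hit_over_net else 0.0
    # Hypothetical shaping term: a small cost for jumping, as in the
    # jump-penalty comparison below (coefficient made up).
    if jumped:
        reward -= jump_penalty
    return reward
```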

Teams

Trained models are included and can be used directly. To use them on a team, set each player's Behavior Type to Default and assign the desired model in the Model field in Unity. Use a Hitter model for player 1 and a Setter model for player 2. The following teams are included:

Main RL team

  • Hitter_RL.onnx - Agent trained to specialize in spiking
  • Setter_RL.onnx - Agent trained to specialize in defense and setting

RL team trained without jump penalty (for comparison purposes)

  • Hitter_RL_without_jump_penalty.onnx - Agent trained to specialize in spiking, without jump penalty
  • Setter_RL_without_jump_penalty.onnx - Agent trained to specialize in defense and setting, with no jump penalty applied to the hitter
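
Outside Unity, you can peek at a policy's tensor interface with onnxruntime (a convenience sketch, not part of the project's workflow; the models are meant to be assigned in the Unity Inspector):

```python
import onnxruntime as ort

# Print the input/output tensors of an exported policy, e.g. to
# confirm the 15-dim observation described above.
sess = ort.InferenceSession("Hitter_RL.onnx")
for t in sess.get_inputs():
    print("input ", t.name, t.shape)
for t in sess.get_outputs():
    print("output", t.name, t.shape)
```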

Hard-coded baseline

To use the hard-coded baseline, set the Behavior Type to Heuristic Only for each player on the team.

Demo


A game between the baseline (purple) and the RL team (blue)


A long rally of spikes in one of our RL vs RL games
