gym-marl-reconnaissance

Gym environments for heterogeneous multi-agent reinforcement learning in non-stationary worlds

This repository's master branch is a work in progress. Please git pull frequently and feel free to open new issues for any undesired, unexpected, or (presumably) incorrect behavior. Thanks 🙏

Also see how to programmatically control real RoboMaster hardware (S1 UGV, Tello Talent UAV) in Python here

Install on Ubuntu/macOS

(optional) Create and access a Python 3.7 environment using conda

$ conda create -n recon python=3.7                                 # Create environment (named 'recon' here)
$ conda activate recon                                             # Activate environment 'recon'

Clone and install the gym-marl-reconnaissance repository

$ git clone https://github.com/JacopoPan/gym-marl-reconnaissance   # Clone repository
$ cd gym-marl-reconnaissance                                       # Enter the repository
$ pip install -e .                                                 # Install the repository

Configure

Set the parameters of the simulation environment

seed: -1
ctrl_freq: 2
pyb_freq: 30
gui: False
record: False
episode_length_sec: 30
action_type: 'task_assignment'      # Alternatively, 'tracking'
obs_type: 'global'
reward_choice: 'reward_c'
adv_type: 'avoidant'                # Alternatively, 'blind'
visibility_threshold: 12
setup:
  edge: 10
  obstacles: 0
  tt: 1
  s1: 1
  adv: 2
  neu: 1
debug: False
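
To illustrate how the timing parameters above relate to each other, here is a minimal sketch. It assumes that `ctrl_freq` is the control (agent step) frequency in Hz, `pyb_freq` is the physics stepping frequency in Hz, and `episode_length_sec` is in seconds; these readings are assumptions and are not confirmed by the repository. The helper `timing_summary` is hypothetical and not part of the package.

```python
def timing_summary(config):
    """Derive step counts from the timing parameters (assumed to be in Hz and seconds)."""
    ctrl_freq = config["ctrl_freq"]
    pyb_freq = config["pyb_freq"]
    # Assumption: the physics frequency is an integer multiple of the control frequency.
    if pyb_freq % ctrl_freq != 0:
        raise ValueError("pyb_freq is assumed to be a multiple of ctrl_freq")
    return {
        "control_steps_per_episode": config["episode_length_sec"] * ctrl_freq,
        "physics_steps_per_control_step": pyb_freq // ctrl_freq,
    }


# Values copied from the configuration above.
example_config = {"ctrl_freq": 2, "pyb_freq": 30, "episode_length_sec": 30}
summary = timing_summary(example_config)
print(summary)
```

Under these assumptions, the default values give 60 control steps per episode, with 15 physics steps for each control step.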

Use

Step an environment with random action inputs

$ python3 ./experiments/debug.py --random True
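
Stepping with random actions generally follows the loop sketched below. This is not the content of `debug.py`: `StubEnv` is a hypothetical stand-in that mimics a classic Gym-style `reset`/`step` interface (the 4-tuple return is an assumption), and `sample_action` replaces what would typically be `env.action_space.sample()` in a real Gym environment.

```python
import random


class StubEnv:
    """Hypothetical stand-in with a Gym-style interface (not the repository's environment)."""

    def __init__(self, episode_steps=60, num_actions=3, seed=0):
        self.episode_steps = episode_steps
        self.num_actions = num_actions
        self.rng = random.Random(seed)
        self.t = 0

    def sample_action(self):
        # Draw a uniformly random discrete action.
        return self.rng.randrange(self.num_actions)

    def reset(self):
        self.t = 0
        return 0  # placeholder observation

    def step(self, action):
        self.t += 1
        obs = self.t  # placeholder observation
        reward = 1.0 if action == 0 else 0.0  # arbitrary illustrative reward
        done = self.t >= self.episode_steps
        return obs, reward, done, {}


def random_rollout(env):
    """Run one episode with random actions; return (number of steps, total reward)."""
    env.reset()
    done = False
    steps = 0
    total_reward = 0.0
    while not done:
        _obs, reward, done, _info = env.step(env.sample_action())
        total_reward += reward
        steps += 1
    return steps, total_reward


steps, total_reward = random_rollout(StubEnv())
print(steps, total_reward)
```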

Step an environment with a greedy policy (only for task_assignment)

$ python3 ./experiments/debug.py

Learn using stable-baselines3

$ python3 ./experiments/train.py --algo <a2c | ppo> --yaml <filename in ./experiments/configurations/>

Replay a trained agent

$ python3 ./experiments/test.py --exp ./results/exp--<algo>--<config>--<date>_<time>
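
The experiment folder name encodes the algorithm, configuration, and timestamp. A small sketch of how those fields could be recovered from a path follows; `parse_experiment_path` is a hypothetical helper, it assumes that none of the fields contains `--`, and the example folder name uses placeholder values rather than a real run.

```python
from pathlib import PurePosixPath


def parse_experiment_path(path):
    """Split 'exp--<algo>--<config>--<date>_<time>' into its fields."""
    name = PurePosixPath(path).name
    parts = name.split("--")
    # Assumption: exactly four '--'-separated fields, the first being 'exp'.
    if len(parts) != 4 or parts[0] != "exp":
        raise ValueError(f"unexpected experiment folder name: {name!r}")
    _, algo, config, timestamp = parts
    return {"algo": algo, "config": config, "timestamp": timestamp}


# Placeholder folder name for illustration only.
info = parse_experiment_path("./results/exp--ppo--example_config--DATE_TIME")
print(info)
```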

Results

Task assignment (1 UAV and 1 UGV vs 2 targets and 1 neutral)

Tracking (1 UAV or 1 UGV vs 1 target, with or without 1 neutral)

University of Toronto's Dynamic Systems Lab / Vector Institute / Mitacs
