Descriptor-Conditioned Gradients MAP-Elites

Repository for:

MAP-Elites with Descriptor-Conditioned Gradients and Archive Distillation into a Single Policy, introducing DCG-MAP-Elites GECCO, which received a Best Paper Award at GECCO 2023 in Lisbon.
Synergizing Quality-Diversity with Descriptor-Conditioned Reinforcement Learning, introducing DCG-MAP-Elites-AI, an extension of DCG-MAP-Elites GECCO.

Summary

DCG-MAP-Elites-AI builds upon the PGA-MAP-Elites algorithm and introduces three key contributions:

The Policy Gradient variation operator is enhanced with a descriptor-conditioned critic that reconciles diversity search with gradient-based methods coming from reinforcement learning.
As a by-product of the critic's training, a descriptor-conditioned actor is trained, at no additional cost, distilling the knowledge of the population into one single versatile policy that can execute a diversity of high-performing behaviors.
In turn, we exploit the descriptor-conditioned actor by injecting it into the population, despite network architecture differences.
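
To inject the descriptor-conditioned actor, it has to be turned into a policy with the same shape as the other members of the population. One way this could work, sketched below in plain Python, is to fix the descriptor and fold its contribution into the bias of the first layer. This is an illustrative assumption, not necessarily how the repository implements it; `fold_descriptor`, the layer shapes and the toy weights are all hypothetical.

```python
# Illustrative sketch only: names, shapes and weights are hypothetical and
# not taken from the repository.


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def conditioned_layer(w_state, w_desc, bias, state, descriptor):
    """First layer of a descriptor-conditioned actor: W_s s + W_d d + b."""
    from_state = matvec(w_state, state)
    from_desc = matvec(w_desc, descriptor)
    return [s + d + b for s, d, b in zip(from_state, from_desc, bias)]


def fold_descriptor(w_desc, bias, descriptor):
    """Absorb a fixed descriptor into the bias: b' = W_d d + b."""
    return [d + b for d, b in zip(matvec(w_desc, descriptor), bias)]


def plain_layer(w_state, bias, state):
    """First layer of an unconditioned policy: W_s s + b."""
    return [s + b for s, b in zip(matvec(w_state, state), bias)]


# Toy weights: 2 hidden units, 3 state dimensions, 2 descriptor dimensions.
W_STATE = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]
W_DESC = [[1.0, 2.0], [0.0, 4.0]]
BIAS = [1.0, -1.0]

state = [1.0, 2.0, 3.0]
descriptor = [0.5, 0.25]

folded_bias = fold_descriptor(W_DESC, BIAS, descriptor)
conditioned_out = conditioned_layer(W_STATE, W_DESC, BIAS, state, descriptor)
plain_out = plain_layer(W_STATE, folded_bias, state)
```

With the descriptor fixed, the conditioned layer and the plain layer with the folded bias produce the same pre-activations, so the resulting policy no longer needs a descriptor input.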

This repository builds on top of the QDax framework and includes four baselines and three ablation studies:

Baselines

PGA-MAP-Elites
QD-PG
MAP-Elites
MAP-Elites-ES

Ablations

DCG-MAP-Elites GECCO
DCG-MAP-Elites without Actor Injection
DCG-MAP-Elites without a Descriptor-Conditioned Actor

Installation

To run this code, you need to clone the repository and install the required libraries with:

git clone https://github.com/adaptive-intelligent-robotics/DCG-MAP-Elites
cd DCG-MAP-Elites
pip install -r requirements.txt

However, we recommend using a containerized environment with Apptainer.

Apptainer

We provide an Apptainer/Singularity definition file to run the source code in a containerized environment in which all the experiments and figures can be reproduced. The commands below assume you are at the root of the cloned repository.

To build a container using Apptainer/Singularity, use the provided apptainer/container.def file:

apptainer build --fakeroot --force --sandbox apptainer/container.sif apptainer/container.def

Then, you can run a shell within the container with:

apptainer shell --pwd /project/ --bind $(pwd):/project/ --cleanenv --containall --home /tmp/ --no-home --nv --workdir apptainer/ --writable apptainer/container.sif

Run main experiments

To run any algorithm <algo> on any environment <env>:

Build a container
Run a shell within the container, as explained in the previous section
In /project/, run python main.py env=<env> algo=<algo> seed=$RANDOM num_iterations=4000 to run for 1,024,000 evaluations
During training, the metrics, visualizations and plots of performance can be found in real time in the output/ directory
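
The run command above implies 1,024,000 / 4000 = 256 evaluations per iteration. The sketch below checks that arithmetic and builds the same command line from Python. `build_command` is a hypothetical helper, not part of the repository; only the `main.py` arguments shown above are taken from this README.

```python
import random

NUM_ITERATIONS = 4000
TOTAL_EVALUATIONS = 1_024_000

# Evaluations per iteration implied by the two numbers quoted above.
EVALUATIONS_PER_ITERATION = TOTAL_EVALUATIONS // NUM_ITERATIONS


def build_command(env, algo, seed=None, num_iterations=NUM_ITERATIONS):
    """Return the argument list for one run of main.py (hypothetical helper)."""
    if seed is None:
        # Mirrors $RANDOM in bash, which yields an integer in [0, 32767].
        seed = random.randint(0, 32767)
    return [
        "python", "main.py",
        f"env={env}", f"algo={algo}",
        f"seed={seed}", f"num_iterations={num_iterations}",
    ]


command = build_command("ant_omni", "dcg_me", seed=42)
print(" ".join(command))
```

The returned list can be passed to `subprocess.run(command, check=True)` from /project/ inside the container, for example to launch several seeds in a loop.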

For example, to run DCG-MAP-Elites-AI on Ant Omni:

python main.py env=ant_omni algo=dcg_me seed=$RANDOM num_iterations=4000

The configurations for all algorithms and all environments can be found in the configs/ directory. Alternatively, any value can be overridden directly on the command line. For example, to increase num_critic_training_steps to 5000 in PGA-MAP-Elites, you can run:

python main.py env=walker2d_uni algo=pga_me seed=$RANDOM num_iterations=4000 algo.num_critic_training_steps=5000
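
The dotted key in `algo.num_critic_training_steps=5000` addresses an entry nested under `algo`. The sketch below shows how such an override could be applied to a nested dictionary. It only illustrates the syntax; `apply_override` and the `name` entry are hypothetical, and the repository's own configuration loader may behave differently.

```python
def apply_override(config, override):
    """Apply one 'dotted.key=value' override to a nested dict, in place."""
    dotted_key, raw_value = override.split("=", 1)
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        # Create intermediate levels when they do not exist yet.
        node = node.setdefault(name, {})
    try:
        value = int(raw_value)  # keep integers as integers
    except ValueError:
        value = raw_value  # leave everything else as a string
    node[leaf] = value
    return config


# Hypothetical starting point; the real values live in the configs/ directory.
config = {"algo": {"name": "pga_me"}}
apply_override(config, "algo.num_critic_training_steps=5000")
apply_override(config, "env=walker2d_uni")
```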

Run reproducibility experiments

The reproducibility experiments load the saved archives from the main experiment (see previous section) and evaluate the expected QD score, expected distance to descriptor and expected max fitness of the populations of the different algorithms.

⚠️ Before running a reproducibility experiment, the main experiment for the corresponding environment and algorithm should be completed.

For example, to evaluate the reproducibility for QD-PG on AntTrap Omni, run:

python main_reproducibility.py env_name=anttrap_omni algo_name=qd_pg

The results will be saved in the output/reproducibility/ directory.
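
As a rough illustration of the three metrics named in this section, the sketch below aggregates repeated evaluations of each elite. The exact definitions used by `main_reproducibility.py` are not stated in this README, so the formulas are assumptions: the expected QD score is taken as the sum of mean fitnesses over cells, the expected max fitness as the best mean fitness, and the expected distance to descriptor as the mean Euclidean distance between a cell's descriptor and the descriptors observed when re-evaluating its elite. All names and numbers are hypothetical.

```python
import math
from statistics import mean


def reproducibility_metrics(archive):
    """Aggregate repeated evaluations of each elite (assumed definitions).

    `archive` maps a cell's target descriptor (tuple) to a list of
    (fitness, observed_descriptor) pairs, one pair per re-evaluation.
    """
    mean_fitnesses = []
    mean_distances = []
    for target, evaluations in archive.items():
        mean_fitnesses.append(mean(f for f, _ in evaluations))
        mean_distances.append(mean(math.dist(target, d) for _, d in evaluations))
    return {
        "expected_qd_score": sum(mean_fitnesses),
        "expected_max_fitness": max(mean_fitnesses),
        "expected_distance_to_descriptor": mean(mean_distances),
    }


# Toy archive with two cells and two re-evaluations per elite.
toy_archive = {
    (0.0, 0.0): [(10.0, (0.0, 0.0)), (6.0, (3.0, 4.0))],
    (1.0, 1.0): [(4.0, (1.0, 1.0)), (4.0, (1.0, 1.0))],
}
metrics = reproducibility_metrics(toy_archive)
```

In the toy archive, the first elite is noisy (its second evaluation lands far from its cell) while the second is perfectly repeatable, which is the kind of difference these metrics are meant to expose.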

Figures

Once all the experiments are completed, any figure from the paper can be reproduced with the scripts in the analysis/ directory.

Figure 1: analysis/plot_main.py
Figure 2: analysis/plot_archive.py
Figure 3: analysis/plot_ablation.py
Figure 4: analysis/plot_reproducibility.py
Figure 5: analysis/plot_elites.py

P-values

Once all the experiments are completed, any p-values from the paper can be replicated with the script analysis/p_values.py.
