SkillHack

Source code for Hierarchical Kickstarting for Skill Transfer in Reinforcement Learning (CoLLAs 2022).

SkillHack is a repository for skill-based learning built on MiniHack and the NetHack Learning Environment (NLE). SkillHack consists of 16 simple skill acquisition environments and 8 complex task environments. The task environments are difficult to solve due to the large state-action space and sparsity of rewards, but can be made tractable by transferring knowledge gained from the simpler skill acquisition environments.

Installation

This repository is dependent on the MiniHack repo (https://github.com/facebookresearch/minihack).

How to run a Skill Transfer experiment

First, create a directory to store pretrained skills in and set the environment variable SKILL_TRANSFER_HOME to point to this directory. This directory must also include a file called skill_config.yaml. A default version of this file can be found in the same directory as this readme.
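
The setup requirements above can be checked before launching a run. The following is a minimal sketch, not part of the repository: the helper name check_skill_home is hypothetical, and it only verifies what this readme states (the variable is set, the directory exists, and skill_config.yaml is present).

```python
import os
from pathlib import Path


def check_skill_home(env=os.environ):
    """Return the skill directory, or raise if the setup described above is incomplete.

    Hypothetical helper: `env` is any mapping that may hold SKILL_TRANSFER_HOME.
    """
    home = env.get("SKILL_TRANSFER_HOME")
    if not home:
        raise RuntimeError("SKILL_TRANSFER_HOME is not set")
    skill_dir = Path(home)
    if not skill_dir.is_dir():
        raise RuntimeError(f"{skill_dir} is not a directory")
    if not (skill_dir / "skill_config.yaml").is_file():
        raise RuntimeError(f"{skill_dir} has no skill_config.yaml")
    return skill_dir
```

Calling check_skill_home() with no arguments reads the real environment; passing a dictionary makes it easy to try the check against a scratch directory.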

Next, this directory must be populated with pretrained skill experts. These are obtained by running skill_transfer_polyhydra.py on a skill-specific environment.

Full Skill Expert Training Walkthrough

In this example we train the fight skill.

First, the agent is trained with the following command:

python -m agent.polybeast.skill_transfer_polyhydra model=baseline env=mini_skill_fight use_lstm=false total_steps=1e7

Once the agent is trained, the final weights are automatically copied to SKILL_TRANSFER_HOME and renamed to the name of the skill environment. So in this example, the skill would be saved at

${SKILL_TRANSFER_HOME}/mini_skill_fight.tar

If there already exists a file with this name in SKILL_TRANSFER_HOME (i.e. if you've already trained an agent on this skill) then the new skill expert is saved as

${SKILL_TRANSFER_HOME}/${ENV_NAME}_${CURRENT_TIME_SECONDS}.tar

Tasks that make use of skills will use the path given in the first example (i.e. without the current time). If you want to use your newer agent, you need to delete the old file and rename the new file to remove the time from it.
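
The naming rule and the manual "delete and rename" step can be sketched as follows. This is an illustration under assumptions, not the repository's code: the function names are hypothetical, and rounding the current time to whole seconds is a guess at how CURRENT_TIME_SECONDS is formatted.

```python
import time
from pathlib import Path


def expert_save_path(skill_home, env_name, now=None):
    """Path a freshly trained expert would get under the naming rule described above."""
    base = Path(skill_home) / f"{env_name}.tar"
    if not base.exists():
        return base
    # Assumption: the timestamp is written as whole seconds.
    seconds = int(time.time() if now is None else now)
    return Path(skill_home) / f"{env_name}_{seconds}.tar"


def promote_expert(skill_home, env_name, timestamped_file):
    """Make a newer, timestamped expert the one that tasks will load."""
    target = Path(skill_home) / f"{env_name}.tar"
    # Path.replace overwrites the old expert, which covers both the
    # "delete the old file" and "rename the new file" steps.
    Path(timestamped_file).replace(target)
    return target
```

For example, promote_expert(skill_home, "mini_skill_fight", newer_file) leaves a single mini_skill_fight.tar in the directory, holding the newer weights.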

Training other skills

Repeat this for all skills.

Remember that all skills need to be trained with use_lstm=false.

The full list of skills to be trained is

  • mini_skill_apply_frost_horn
  • mini_skill_eat
  • mini_skill_fight
  • mini_skill_nav_blind
  • mini_skill_nav_lava
  • mini_skill_nav_lava_to_amulet
  • mini_skill_nav_water
  • mini_skill_pick_up
  • mini_skill_put_on
  • mini_skill_take_off
  • mini_skill_throw
  • mini_skill_unlock
  • mini_skill_wear
  • mini_skill_wield
  • mini_skill_zap_cold
  • mini_skill_zap_death
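
To see which experts still need training, the list above can be compared against the contents of SKILL_TRANSFER_HOME. This is a rough sketch with hypothetical helper names; the generated command reuses the arguments from the fight walkthrough, and applying the same total_steps to every skill is an assumption.

```python
from pathlib import Path

SKILLS = [
    "mini_skill_apply_frost_horn", "mini_skill_eat", "mini_skill_fight",
    "mini_skill_nav_blind", "mini_skill_nav_lava", "mini_skill_nav_lava_to_amulet",
    "mini_skill_nav_water", "mini_skill_pick_up", "mini_skill_put_on",
    "mini_skill_take_off", "mini_skill_throw", "mini_skill_unlock",
    "mini_skill_wear", "mini_skill_wield", "mini_skill_zap_cold",
    "mini_skill_zap_death",
]


def missing_experts(skill_home):
    """Skills with no <skill>.tar file in the given directory."""
    home = Path(skill_home)
    return [s for s in SKILLS if not (home / f"{s}.tar").is_file()]


def training_command(skill):
    """Training command for one skill, modelled on the fight example."""
    # Assumption: every skill uses the same arguments as mini_skill_fight.
    return (
        "python -m agent.polybeast.skill_transfer_polyhydra "
        f"model=baseline env={skill} use_lstm=false total_steps=1e7"
    )
```

Printing training_command(s) for each s in missing_experts(skill_home) gives the commands that remain to be run.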

Training on Tasks

If the relevant skills for the environment are not present in SKILL_TRANSFER_HOME, an error will be shown indicating which skill is missing. The skill-transfer-specific models are

  • foc: Options Framework
  • ks: Kickstarting
  • hks: Hierarchical Kickstarting

The tasks created for skill transfer are

  • mini_simple_seq: Battle
  • mini_simple_union: Over or Around
  • mini_simple_intersection: Prepare for Battle
  • mini_simple_random: Target Practice
  • mini_lc_freeze: Frozen Lava Cross
  • mini_medusa: Medusa
  • mini_mimic: Identify Mimic
  • mini_seamonsters: Sea Monsters

So, for example, to run hierarchical kickstarting on the Target Practice environment, one would call

python -m agent.polybeast.skill_transfer_polyhydra model=hks env=mini_simple_random

All other parameters can be set in the same way as with polyhydra.py.

Repeating the Runs from the Paper

The runs from the paper can be repeated with the following command. If you don't want to run with wandb, set wandb=false.

python -m agent.polybeast.skill_transfer_polyhydra --multirun model=ks,foc,hks,baseline env=mini_simple_seq,mini_simple_intersection,mini_simple_union,mini_simple_random,mini_lc_freeze,mini_medusa,mini_mimic,mini_seamonsters name=1,2,3,4,5,6,7,8,9,10,11,12 total_steps=2.5e8 group=<YOUR_WANDB_GROUP> hks_max_uniform_weight=20 hks_min_uniform_prop=0 train_with_all_skills=false ks_min_lambda_prop=0.05 hks_max_uniform_time=2e7 entity=<YOUR_WANDB_ENTITY> project=<YOUR_WANDB_PROJECT>
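
It is worth estimating the size of this sweep before launching it. The sketch below assumes that --multirun launches one run per combination of the comma-separated values (as Hydra's basic sweeper does) and that name acts as a repeat index; both are assumptions about the command, not statements from the paper.

```python
from itertools import product

models = ["ks", "foc", "hks", "baseline"]
envs = [
    "mini_simple_seq", "mini_simple_intersection", "mini_simple_union",
    "mini_simple_random", "mini_lc_freeze", "mini_medusa",
    "mini_mimic", "mini_seamonsters",
]
names = range(1, 13)  # name=1,...,12 in the command above

# One run per (model, env, name) combination.
runs = list(product(models, envs, names))
steps_per_run = 2.5e8  # total_steps in the command above
total_steps = len(runs) * steps_per_run

print(len(runs))  # 384
print(f"{total_steps:.1e} environment steps in total")
```

Under these assumptions the command launches 4 x 8 x 12 = 384 runs, so trimming the model, env, or name lists is the simplest way to run a smaller subset.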

Final Notes

  • If you want to train with the fixed version of nav_blind, go to data/tasks/tasks.json and replace mini_skill_nav_blind with mini_skill_nav_blind_fixed
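
The replacement can also be scripted. This is a minimal sketch that works on the file's text, because the structure of tasks.json is not described here; the helper name is hypothetical, and it assumes the skill name appears as a quoted JSON string.

```python
import json

OLD = "mini_skill_nav_blind"
NEW = "mini_skill_nav_blind_fixed"


def use_fixed_nav_blind(text):
    """Return the JSON text with the nav_blind skill swapped for the fixed version."""
    # Match the quoted name only, so an entry that is already fixed is left untouched.
    updated = text.replace(f'"{OLD}"', f'"{NEW}"')
    json.loads(updated)  # raises ValueError if the result is not valid JSON
    return updated
```

To apply it, read data/tasks/tasks.json into a string, pass it through use_fixed_nav_blind, and write the result back. Running it twice is harmless, since the second pass finds nothing to replace.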

Citation

If you make use of this code in your own work, please cite our paper:

@misc{matthews2022hierarchical,
  url = {https://arxiv.org/abs/2207.11584},
  author = {Matthews, Michael and Samvelyan, Mikayel and Parker-Holder, Jack and Grefenstette, Edward and Rockt{\"a}schel, Tim},
  title = {Hierarchical Kickstarting for Skill Transfer in Reinforcement Learning},
  publisher = {arXiv},
  year = {2022},
  copyright = {arXiv.org perpetual, non-exclusive license}
}
