takato86/shaper

shaper

A library for reward shaping in reinforcement learning (RL). It includes implementations of reward-shaping methods from published papers.

Installation

pip install -e .

How to use

Write your script along the lines of the following example. Examples of domain-specific achievers are in the "examples" directory.

import shaper
from shaper.achiever.interface import AbstractAchiever
from shaper.aggregator.subgoal_based import DynamicTrajectoryAggregation
import gym

# How to create the reward shaping instance.
def is_success(done, info):
    if "is_success" in info:
        return info["is_success"]
    return done

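# Achiever is domain-specific. AbstractAchiever stands in for your own concrete achiever;
# see the implementation examples in the "examples" directory.
achiever = AbstractAchiever()
aggregator = DynamicTrajectoryAggregation(achiever)
vfunc = aggregator.create_vfunc()
gamma = 0.99  # discount factor (example value)
lr = 0.01     # learning rate (example value)
rs = shaper.SarsaRS(gamma, lr, aggregator, vfunc, is_success=is_success)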

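# How to use in RL loop (classic Gym API: reset() returns obs, step() returns 4 values).
env = gym.make("CartPole-v1")
pre_obs = env.reset()

for i in range(100):
    action = env.action_space.sample()
    obs, reward, done, info = env.step(action)
    shaping_reward = rs.step(pre_obs, action, reward, obs, done, info)
    # Carry the observation forward; start a new episode when the current one ends.
    pre_obs = env.reset() if done else obs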
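
The loop above computes a shaping reward but does not show what a learner does with it. A minimal sketch follows, assuming the value returned by rs.step is an additive bonus on top of the environment reward (the README does not state this explicitly). The q_update function, its parameters, and the tabular Q-learning learner are illustrative assumptions, not part of shaper.

```python
from collections import defaultdict


def q_update(q, state, action, reward, shaping_reward, next_state,
             n_actions, done, gamma=0.99, lr=0.1):
    """One tabular Q-learning step that learns from reward + shaping_reward.

    Hypothetical helper: it is not part of shaper.
    """
    # Assumption: the shaping reward is an additive bonus.
    shaped = reward + shaping_reward
    # No bootstrapping from the next state once the episode has ended.
    best_next = 0.0 if done else max(q[(next_state, a)] for a in range(n_actions))
    td_error = shaped + gamma * best_next - q[(state, action)]
    q[(state, action)] += lr * td_error
    return q[(state, action)]


# Q-table with a default value of 0.0 for unseen (state, action) pairs.
q = defaultdict(float)
value = q_update(q, state=0, action=1, reward=1.0, shaping_reward=0.5,
                 next_state=1, n_actions=2, done=False)
```

The same pattern applies to other learners: the agent is trained on the shaped reward, while evaluation typically reports the unshaped environment reward.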

Aggregator objects

  1. DynamicTrajectoryAggregation
  2. Discretizer
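
To illustrate the general idea behind aggregation, here is a minimal sketch of a uniform-bin discretizer. It assumes an aggregator's role is to map raw observations to a smaller set of abstract states. The discretize function, its parameters, and the bounds are hypothetical and do not reflect the interface of shaper's Discretizer.

```python
def discretize(obs, lows, highs, n_bins):
    """Map a continuous observation to a tuple of bin indices (an abstract state).

    Hypothetical helper: it is not part of shaper.
    """
    indices = []
    for x, lo, hi in zip(obs, lows, highs):
        # Clip to the given bounds so out-of-range values fall into the edge bins.
        clipped = min(max(x, lo), hi)
        ratio = (clipped - lo) / (hi - lo)
        # The upper bound itself maps to the last bin.
        indices.append(min(int(ratio * n_bins), n_bins - 1))
    return tuple(indices)


abstract_state = discretize([0.05, -1.2], lows=[-1.0, -2.0], highs=[1.0, 2.0], n_bins=4)
```

Nearby observations share one abstract state, so a value function defined over abstract states has far fewer entries to learn.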

Shaping objects

  1. SarsaRS
  2. SubgoalRS
  3. NaiveRS
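
For orientation, here is a minimal sketch of potential-based reward shaping, a standard form in which the bonus is gamma * potential(next_obs) - potential(obs). Whether SarsaRS, SubgoalRS, or NaiveRS compute their shaping reward this way is not stated in this README. The function names and the distance-based potential are assumptions for illustration only.

```python
def potential_based_shaping(potential, obs, next_obs, gamma):
    """Return the shaping bonus gamma * potential(next_obs) - potential(obs)."""
    return gamma * potential(next_obs) - potential(obs)


def potential(obs):
    """Hypothetical potential: higher the closer obs is to a goal at position 10."""
    return -abs(10 - obs)


# Moving one step towards the goal yields a positive bonus.
bonus = potential_based_shaping(potential, obs=3, next_obs=4, gamma=1.0)
```

With this form, a step towards a higher-potential state earns a positive bonus and a step away from it earns a negative one.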