
InfoMARL

Here we use information regularization to promote cooperation / competition via intention signalling / hiding in a multi-agent RL problem. The environment is a simple, two-goal grid world built in OpenAI Gym, based on the example here.

The first agent, Alice, has access to the goal, is parameterized with a tabular policy and value function, and is trained using REINFORCE, based on an implementation here. Alice's policy is regularized with the mutual information between goal and action (given state), I(goal; action | state). Depending on the sign of the information weight, this regularization encourages her to either signal or hide her private information about the goal.

The second agent, Bob, does not have access to the goal and must instead infer it purely from observing Alice's behavior. Information regularization of Alice therefore directly affects Bob's success. In summary, information regularization allows Alice to train alone while preparing for cooperation / competition with a friend / foe (Bob) introduced later. More detailed notes can be found here.
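
As a concrete sketch of the regularizer (this is not the repository's code; the names `theta`, `goal_prior`, `info_term`, and `beta` are hypothetical, and a uniform, state-independent goal prior is assumed for simplicity), the pointwise quantity log pi(a|s,g) - log pi(a|s) can serve as a per-step REINFORCE pseudo-reward, since its expectation under p(g) pi(a|s,g) is exactly I(goal; action | state):

```python
import numpy as np

# Minimal sketch; all names here are illustrative, not the repo's API.
n_states, n_goals, n_actions = 25, 2, 4

# Tabular policy logits for pi(action | state, goal).
theta = np.zeros((n_states, n_goals, n_actions))
goal_prior = np.full(n_goals, 1.0 / n_goals)  # assumed uniform p(goal)

def pi(state, goal):
    """Softmax policy pi(. | state, goal) from the tabular logits."""
    logits = theta[state, goal]
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def info_term(state, goal, action):
    """Pointwise log pi(a|s,g) - log pi(a|s). Its expectation over
    p(g) pi(a|s,g) is I(goal; action | state). Simplification: the
    marginal mixes over the prior p(g) rather than a Bayes-updated
    posterior p(g | trajectory so far)."""
    marginal = sum(goal_prior[g] * pi(state, g) for g in range(n_goals))
    return np.log(pi(state, goal)[action]) - np.log(marginal[action])

def regularized_reward(reward, state, goal, action, beta=0.1):
    """Per-step pseudo-reward for REINFORCE: beta > 0 rewards actions
    that reveal the goal (signalling); beta < 0 penalizes them (hiding)."""
    return reward + beta * info_term(state, goal, action)
```

REINFORCE then treats this regularized reward exactly like the environment reward when computing returns and the policy-gradient update.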

TODOs:

  • make richer episode visualization
  • use I(t) = sum of info up until t, so that the agent prefers revealing info later
  • find a lossy case
  • learn friend/foe policies and optimize mixture parameter
  • learn pi(beta) and optimize beta
  • try discounting KL / entropy into the future (as in the Distral paper); for a high enough beta, Alice should try not to terminate episodes
  • under what conditions might Alice "overshoot" to signal?
  • under what conditions are I(traj; goal) and I(action; goal | state) approximately equal?
