
How can I train an Agent on multiple Environments? #65

Open
lagidigu opened this issue Mar 1, 2019 · 3 comments

lagidigu commented Mar 1, 2019

Hi,

I read the GCP tutorial on how to set up Dopamine, but I can't figure out how to train the agent/brain on multiple environments simultaneously, like you did during the PPO/Rainbow training.

Would it just be a matter of creating several runners?

Thanks a lot in advance,

Cheers
Luc

awjuliani self-assigned this Mar 1, 2019
awjuliani (Contributor)

Hi @lagidigu

With Dopamine it is only possible to train using a single environment at a time. OpenAI Baselines, on the other hand, allows multiple environments to run concurrently for a number of its algorithms (including PPO). Here are instructions for running Unity environments with Baselines: https://github.com/Unity-Technologies/ml-agents/tree/master/gym-unity#running-openai-baselines-algorithms. The difference is that you will want to replace UnityEnv with ObstacleTowerEnv.
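For concreteness, here is a minimal sketch of that swap, modeled on the env-construction thunk from the linked gym-unity example. The build path, the `retro=True` flag, and the use of a distinct `worker_id` per environment are illustrative assumptions, not something the linked instructions prescribe:

```python
from obstacle_tower_env import ObstacleTowerEnv
from baselines.common.vec_env.subproc_vec_env import SubprocVecEnv


def make_env(rank):
    """Return a thunk that builds one ObstacleTowerEnv (replacing UnityEnv in the gym-unity example)."""
    def _thunk():
        # Each concurrent instance is given its own worker_id so the Unity builds
        # do not compete for the same communication port (assumption based on the
        # ML-Agents worker_id convention).
        return ObstacleTowerEnv('./ObstacleTower/obstacletower',
                                retro=True, worker_id=rank)
    return _thunk


# Vectorize, e.g. four concurrent environments for PPO-style training.
env = SubprocVecEnv([make_env(rank) for rank in range(4)])
```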

MarcoMeter (Contributor)

When used as shown below, the environment cannot establish a connection to Python; the socket somehow fails. This was tested on two Windows machines. Surprisingly, if num_env is set to one, two instances of the Obstacle Tower build are launched.

```python
from obstacle_tower_env import ObstacleTowerEnv
import sys
import argparse
from baselines.common.vec_env.subproc_vec_env import SubprocVecEnv
from baselines.bench import Monitor
from baselines import logger
import baselines.ppo2.ppo2 as ppo2

import os

try:
    from mpi4py import MPI
except ImportError:
    MPI = None

def make_unity_env(env_filename, num_env, visual, start_index=1):
    """
    Create a wrapped, monitored Unity environment.
    """
    def make_env(rank, use_visual=True): # pylint: disable=C0111
        def _thunk():
            env = ObstacleTowerEnv(env_filename, retro=True, realtime_mode=True, worker_id=rank)
            env = Monitor(env, logger.get_dir() and os.path.join(logger.get_dir(), str(rank)))
            return env
        return _thunk
    return SubprocVecEnv([make_env(i + start_index) for i in range(num_env)])

def main():
    env = make_unity_env('./ObstacleTower/obstacletower', 1, True)
    ppo2.learn(
        network="mlp",
        env=env,
        total_timesteps=100000,
        lr=1e-3,
    )

if __name__ == '__main__':
    main()
```

lagidigu (Author) commented Mar 4, 2019

> Hi @lagidigu
>
> With Dopamine it is only possible to train using a single environment at a time. OpenAI Baselines, on the other hand, allows multiple environments to run concurrently for a number of its algorithms (including PPO). Here are instructions for running Unity environments with Baselines: https://github.com/Unity-Technologies/ml-agents/tree/master/gym-unity#running-openai-baselines-algorithms. The difference is that you will want to replace UnityEnv with ObstacleTowerEnv.

Thanks a lot!
