
How can I train an Agent on multiple Environments? #65

Open
lagidigu opened this issue Mar 1, 2019 · 3 comments

lagidigu commented Mar 1, 2019

Hi,

I read the GCP tutorial on how to set up Dopamine, but I can't figure out how to train the agent/brain on multiple environments simultaneously, like you did during the PPO/Rainbow training.

Would it just be a matter of creating several runners?

Thanks a lot in advance,

Cheers
Luc

awjuliani self-assigned this Mar 1, 2019
awjuliani (Contributor)

Hi @lagidigu

With Dopamine it is only possible to train using a single environment at a time. OpenAI Baselines, on the other hand, allows multiple environments to run concurrently for a number of its algorithms (including PPO). Here are instructions for running Unity environments with Baselines: https://github.com/Unity-Technologies/ml-agents/tree/master/gym-unity#running-openai-baselines-algorithms. The difference is that you will want to replace UnityEnv with ObstacleTowerEnv.
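For concreteness, here is a minimal sketch of that swap, modeled on the env-construction thunk from the linked gym-unity example. The build path, the `retro=True` flag, and the use of a distinct `worker_id` per environment are illustrative assumptions, not something the linked instructions prescribe:

```python
from obstacle_tower_env import ObstacleTowerEnv
from baselines.common.vec_env.subproc_vec_env import SubprocVecEnv


def make_env(rank):
    """Return a thunk that builds one ObstacleTowerEnv (replacing UnityEnv in the gym-unity example)."""
    def _thunk():
        # Each concurrent instance is given its own worker_id so the Unity builds
        # do not compete for the same communication port (assumption based on the
        # ML-Agents worker_id convention).
        return ObstacleTowerEnv('./ObstacleTower/obstacletower',
                                retro=True, worker_id=rank)
    return _thunk


# Vectorize, e.g. four concurrent environments for PPO-style training.
env = SubprocVecEnv([make_env(rank) for rank in range(4)])
```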

MarcoMeter (Contributor)

When used as shown below, the environment cannot establish a connection to Python; the socket somehow fails. This was tested on two Windows machines. Surprisingly, if num_env is set to one, two instances of the Obstacle Tower build are launched.

```python
from obstacle_tower_env import ObstacleTowerEnv
import sys
import argparse
from baselines.common.vec_env.subproc_vec_env import SubprocVecEnv
from baselines.bench import Monitor
from baselines import logger
import baselines.ppo2.ppo2 as ppo2

import os

try:
    from mpi4py import MPI
except ImportError:
    MPI = None

def make_unity_env(env_filename, num_env, visual, start_index=1):
    """
    Create a wrapped, monitored Unity environment.
    """
    def make_env(rank, use_visual=True): # pylint: disable=C0111
        def _thunk():
            env = ObstacleTowerEnv(env_filename, retro=True, realtime_mode=True, worker_id=rank)
            env = Monitor(env, logger.get_dir() and os.path.join(logger.get_dir(), str(rank)))
            return env
        return _thunk
    return SubprocVecEnv([make_env(i + start_index) for i in range(num_env)])

def main():
    env = make_unity_env('./ObstacleTower/obstacletower', 1, True)
    ppo2.learn(
        network="mlp",
        env=env,
        total_timesteps=100000,
        lr=1e-3,
    )

if __name__ == '__main__':
    main()
```

lagidigu (Author) commented Mar 4, 2019

> Hi @lagidigu
>
> With Dopamine it is only possible to train using a single environment at a time. OpenAI Baselines, on the other hand, allows multiple environments to run concurrently for a number of its algorithms (including PPO). Here are instructions for running Unity environments with Baselines: https://github.com/Unity-Technologies/ml-agents/tree/master/gym-unity#running-openai-baselines-algorithms. The difference is that you will want to replace UnityEnv with ObstacleTowerEnv.

Thanks a lot!
