I have experience using Stable Baselines 3 as a module but am a beginner regarding its internal workings. I have a decent understanding of both multiagent and single-agent reinforcement learning.
My question is: In multiagent reinforcement learning using Stable Baselines 3, is it possible to provide different observation information to separate agents and have them learn independently? If so, how can this be specifically implemented?
I am using the gym-pybullet-drones repository for reinforcement learning of multi-drone control with Stable Baselines 3, which can be found here: gym-pybullet-drones.
As per the tutorial, I am executing the following code for multiagent reinforcement learning:
```shell
cd gym_pybullet_drones/examples/
python learn.py --multiagent true
```
Within learn.py, learning is conducted using Stable Baselines 3's PPO in the following manner:
```python
train_env = make_vec_env(MultiHoverAviary,
                         env_kwargs=dict(num_drones=DEFAULT_AGENTS, obs=DEFAULT_OBS, act=DEFAULT_ACT),
                         n_envs=1,
                         seed=0
                         )
model = PPO('MlpPolicy',
            train_env,
            # tensorboard_log=filename+'/tb/',
            verbose=1)
model.learn(total_timesteps=int(1e7) if local else int(1e2))  # shorter training in GitHub Actions pytest
```
In this setup, the environment defined in MultiHoverAviary.py and its parent class BaseRLAviary.py implements the _computeObs(self) method, which concatenates the state information of all drones into a single observation.
With this configuration and the learning function in learn.py, I understand that all agents share the same model and input the same information into both the Policy and Value networks for learning.
I want to modify the observations for each agent. Specifically, I want agent0 to receive only positional information about agent1, and agent1 to receive only positional information about agent0. I believe this might require setting up multiple models, but the current implementation in the gym-pybullet-drones repository with Stable Baselines 3 does not seem to support this.
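To make the per-agent observation concrete, here is a minimal sketch of the kind of filtering I have in mind. It is purely illustrative: it assumes each drone contributes OBS_PER_DRONE consecutive values to the combined vector and that the first 3 of those are the (x, y, z) position — the actual layout would need to be checked against _computeObs() in BaseRLAviary.py.

```python
import numpy as np

# Illustrative only: the layout constants below are assumptions, not the
# actual gym-pybullet-drones observation format.
OBS_PER_DRONE = 12  # assumed per-drone state size; verify against _computeObs()

def per_agent_obs(combined_obs, agent_id, num_drones=2):
    """Return agent_id's own full state plus only the positions of the others."""
    obs = np.asarray(combined_obs).reshape(num_drones, OBS_PER_DRONE)
    own_state = obs[agent_id]  # the agent's full state vector
    # Assumed: the first 3 entries of each drone's block are its (x, y, z).
    other_positions = [obs[j, :3] for j in range(num_drones) if j != agent_id]
    return np.concatenate([own_state, *other_positions])

combined = np.arange(2 * OBS_PER_DRONE, dtype=float)  # fake 2-drone observation
obs0 = per_agent_obs(combined, 0)  # agent0: own state + agent1's position only
obs1 = per_agent_obs(combined, 1)  # agent1: own state + agent0's position only
```

Each agent's view is then a 15-value vector (12 own-state values plus the other drone's 3-value position) rather than the full 24-value concatenation.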
I am asking this question because someone well-versed in Stable Baselines 3 might know a solution. As discussed in the issue linked here, my understanding is that multiagent reinforcement learning is not a primary focus of Stable Baselines 3. However, any advice or solution to the above problem would be greatly appreciated.
Thank you.
Checklist
I have checked that there is no similar issue in the repo
SB3 does not officially support multi-agent RL. However, you can treat the SB3 agent as one big agent whose observation and action spaces are the combined spaces of the "sub-agents"; take a look at the PettingZoo examples.