[RLlib] Index Error with GPU #45418
Labels
bug
Something that is supposed to be working; but isn't
rllib
RLlib related issues
rllib-oldstack-cleanup
Issues related to cleaning up classes, utilities on the old API stack
What happened + What you expected to happen
When running a PPO training session with a single GPU, an IndexError occurs in torch_policy_v2.py (see traceback below). The error consistently occurs on the fifth iteration, if that helps. I've seen similar reports from several years ago; in those cases the error also only seemed to occur when there was a single GPU. If that were still the cause, I would have expected it to be fixed by now.
```
Traceback (most recent call last):
  File "/home/dkunz/anaconda3/envs/envRL/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/dkunz/anaconda3/envs/envRL/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/dkunz/.vscode/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/main.py", line 39, in <module>
    cli.main()
  File "/home/dkunz/.vscode/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
    run()
  File "/home/dkunz/.vscode/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
    runpy.run_path(target, run_name="__main__")
  File "/home/dkunz/.vscode/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/dkunz/.vscode/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/dkunz/.vscode/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
    exec(code, run_globals)
  File "/home/dkunz/python3/gymnasium/gymCopter/gymCopterPPO.py", line 46, in <module>
    results = algo.train()
  File "/home/dkunz/anaconda3/envs/envRL/lib/python3.10/site-packages/ray/tune/trainable/trainable.py", line 331, in train
    raise skipped from exception_cause(skipped)
  File "/home/dkunz/anaconda3/envs/envRL/lib/python3.10/site-packages/ray/tune/trainable/trainable.py", line 328, in train
    result = self.step()
  File "/home/dkunz/anaconda3/envs/envRL/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 873, in step
    train_results, train_iter_ctx = self._run_one_training_iteration()
  File "/home/dkunz/anaconda3/envs/envRL/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 3156, in _run_one_training_iteration
    results = self.training_step()
  File "/home/dkunz/anaconda3/envs/envRL/lib/python3.10/site-packages/ray/rllib/algorithms/ppo/ppo.py", line 428, in training_step
    return self._training_step_old_and_hybrid_api_stacks()
  File "/home/dkunz/anaconda3/envs/envRL/lib/python3.10/site-packages/ray/rllib/algorithms/ppo/ppo.py", line 587, in _training_step_old_and_hybrid_api_stacks
    train_results = multi_gpu_train_one_step(self, train_batch)
  File "/home/dkunz/anaconda3/envs/envRL/lib/python3.10/site-packages/ray/rllib/execution/train_ops.py", line 152, in multi_gpu_train_one_step
    num_loaded_samples[policy_id] = local_worker.policy_map[
  File "/home/dkunz/anaconda3/envs/envRL/lib/python3.10/site-packages/ray/rllib/policy/torch_policy_v2.py", line 802, in load_batch_into_buffer
    return len(slices[0])
IndexError: list index out of range
```
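For context, the failing line (`return len(slices[0])` in `load_batch_into_buffer`) indexes the first element of a list of batch slices, so it raises IndexError whenever that list comes back empty, e.g. if the train batch being loaded onto the device contains no samples. A minimal pure-Python sketch of that failure mode (the `timeslices` helper here is a hypothetical stand-in for RLlib's batch slicing, not the real implementation):

```python
def timeslices(samples, num_slices):
    """Hypothetical stand-in for RLlib's batch slicing: split `samples`
    into contiguous chunks, roughly one per device."""
    if not samples:
        return []  # an empty batch produces no slices at all
    size = max(1, len(samples) // num_slices)
    return [samples[i:i + size] for i in range(0, len(samples), size)]

# Normal case: a non-empty batch yields at least one slice.
assert len(timeslices(list(range(10)), num_slices=1)[0]) == 10

# Failure case: an empty batch yields an empty slice list, and
# indexing slices[0] raises the IndexError seen in the traceback.
slices = timeslices([], num_slices=1)
try:
    n = len(slices[0])
except IndexError as exc:
    print(f'IndexError: {exc}')
```

Whether an empty batch is actually what happens here on the fifth iteration is speculation on my part; it is just the simplest way to make that exact line fail.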
Versions / Dependencies
Ubuntu 22.04 LTS
Python 3.10.14
Ray 2.22.0
PyTorch 2.3.0
Reproduction script
I don't have a short reproduction script, but here is the PPO script I'm using. The environment is quite complex, and that could be the problem, but it otherwise seems to run satisfactorily.
"""
gymCopterPPO.py
"""
Third-party imports
import ray
from ray.tune.registry import register_env
from ray.rllib.algorithms.ppo import PPOConfig
Local application imports
from gymCopter import GymCopter as gymCopterEnv
def env_creator(env_config):
"""
Function required to register gymnasium environments
"""
gc = gymCopterEnv(env_config)
gc.copter.auxdata['EngOper'] = False
return gc
#################################################################
if name == 'main':
ray.init()
register_env('gymCopterEnv', env_creator)
# Configure PPO
ppo_config = PPOConfig()
ppo_config.training(gamma=0.9447)
ppo_config.training(lr=5.0e-5)
ppo_config.training(lambda_=0.9556)
ppo_config.training(train_batch_size=5000)
ppo_config.training(model={'fcnet_hiddens': [256, 256]})
ppo_config.environment(env='gymCopterEnv')
ppo_config.environment(env_config={'hbar0': None, 'vbar0': None, 'EngOper': False, 'render_mode': None})
ppo_config.framework(framework='torch')
ppo_config.rollouts(num_rollout_workers=8)
ppo_config.debugging(log_level='ERROR')
ppo_config.resources(num_gpus=0.1)
ppo_config.resources(num_cpus_per_worker=2)
ppo_config.resources(num_gpus_per_worker=0.1)
# Build an algorithm from from the configuration
algo = ppo_config.build()
# Train for n iterations and report results
max_reward_mean = -1.0
for n in range(10):
results = algo.train()
episode_reward_mean = results['episode_reward_mean']
if episode_reward_mean > max_reward_mean:
max_reward_mean = episode_reward_mean
print(f'n = {n}: Episode Mean Reward = {episode_reward_mean}; Max Mean Reward = {max_reward_mean}')
ray.shutdown()
Issue Severity
High: It blocks me from completing my task.