Improve rollout utility #106

neo-alex · 2021-11-19T14:04:47Z

🚀 Feature

Fix, merge and improve rollout utilities

Pitch

There are currently 2 almost identical rollout utilities in utils.py (rollout and rollout_episode) that could be improved & merged into 1, which would avoid code duplication and be less confusing for end-users.

I propose the following changes:

- remove rollout_episode, but add a boolean return_episodes parameter to rollout to optionally enable the same feature (without the current bugs, namely when from_state is None or when num_episodes > 1)
- clean up the save_file feature by introducing a file_formatter parameter that customises the file content generated from collected episodes (alternatively, we could try to solve the first 2 points jointly, e.g. by introducing an episodes_formatter parameter which, if not None, would collect episodes in a certain format and either return them or save them to file, depending on the save_file parameter)
- possibly add callbacks (e.g. step_callback & episode_callback) to give the end-user the option to run custom code within the rollout loops
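
To make the proposal more concrete, here is a minimal sketch of what a merged rollout could look like. It is not the library's actual code: the toy CountdownDomain class, its reset/step methods, the max_steps parameter and the exact signature are all assumptions made for illustration. Only the names rollout, return_episodes, from_state, num_episodes, step_callback and episode_callback come from the proposal above.

```python
from typing import Any, Callable, List, Optional, Tuple

# One episode = (observations, actions, rewards); this layout is an assumption.
Episode = Tuple[List[Any], List[Any], List[float]]


class CountdownDomain:
    """Hypothetical toy domain: the state counts down until it reaches zero."""

    def __init__(self, start: int = 3) -> None:
        self.start = start
        self.state = start

    def reset(self, from_state: Optional[int] = None) -> int:
        # from_state=None falls back to the domain's own initial state.
        self.state = self.start if from_state is None else from_state
        return self.state

    def step(self, action: int) -> Tuple[int, float, bool]:
        self.state -= action
        return self.state, -1.0, self.state <= 0


def rollout(
    domain: CountdownDomain,
    policy: Callable[[Any], Any],
    from_state: Optional[int] = None,
    num_episodes: int = 1,
    max_steps: int = 100,
    return_episodes: bool = False,
    step_callback: Optional[Callable[[Any, Any, float], None]] = None,
    episode_callback: Optional[Callable[[Episode], None]] = None,
) -> Optional[List[Episode]]:
    """Run num_episodes episodes; optionally return all of them as a list."""
    episodes: List[Episode] = []
    for _ in range(num_episodes):
        obs = domain.reset(from_state)
        observations, actions, rewards = [obs], [], []
        for _ in range(max_steps):
            action = policy(obs)
            obs, reward, done = domain.step(action)
            observations.append(obs)
            actions.append(action)
            rewards.append(reward)
            if step_callback is not None:
                step_callback(obs, action, reward)
            if done:
                break
        episode = (observations, actions, rewards)
        # Collect every episode instead of returning after the first one.
        episodes.append(episode)
        if episode_callback is not None:
            episode_callback(episode)
    return episodes if return_episodes else None


episodes = rollout(
    CountdownDomain(3), policy=lambda obs: 1, num_episodes=2, return_episodes=True
)
```

Returning a list even when num_episodes is 1 keeps the return type uniform; a caller that previously relied on rollout_episode would simply take the first element.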

This fixes #106.

We merge rollout and rollout_episode together:
- add a return_episodes boolean argument to rollout, deciding whether to return episodes
- episodes are returned as a list of episodes, each episode being a tuple of observations, actions, and values (previously it returned prematurely after one episode, with only one tuple, even if num_episodes was > 1)
- in rollout_episode, verbose=False was muting the logger.info("goal reached ...") message; instead we introduce a parameter to change the level of this log message, so that MetaPolicy in particular can relegate it to debug level
- update previous code using rollout_episode to call rollout with return_episodes=True and use the first episode of the list

Fix the verbose behaviour by restoring the logger's previous level at the end of the rollout. (Previously, the logger level was set to debug once and for all, even after leaving rollout.)
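
The logger fix described in this commit message can be sketched as follows. This is only an illustration of the save-and-restore pattern, not the merged code: the function name run_with_restored_log_level, the logger name and the goal_logging_level parameter are hypothetical.

```python
import logging
from typing import Callable

logger = logging.getLogger("rollout_sketch")  # hypothetical logger name


def run_with_restored_log_level(
    run: Callable[[], bool],
    verbose: bool = True,
    goal_logging_level: int = logging.INFO,  # hypothetical parameter name
) -> bool:
    """Run a rollout-like callable, then put the logger level back."""
    previous_level = logger.level  # level explicitly set on this logger
    if verbose:
        logger.setLevel(logging.DEBUG)
    try:
        goal_reached = run()
        if goal_reached:
            # The caller chooses the level, e.g. logging.DEBUG to keep it quiet.
            logger.log(goal_logging_level, "goal reached")
        return goal_reached
    finally:
        # Restore the level even if run() raised an exception.
        logger.setLevel(previous_level)


logger.setLevel(logging.WARNING)
reached = run_with_restored_log_level(lambda: True, goal_logging_level=logging.DEBUG)
```

Using try/finally means the level is restored on every exit path, so enabling verbose output for one rollout does not leak debug logging into the rest of the program.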

nhuet mentioned this issue May 21, 2024

Unify the rollout utilities #367

Merged

g-poveda closed this as completed in #367 May 21, 2024
