[Feature Request] Training data selection: Create more "interesting" Replay Buffer Iterators #66

RaghuSpaceRajan · 2021-04-07T11:43:47Z

🚀 Feature Request

Create Replay Buffer Iterators that can select training and validation data in various "interesting" ways, similar to TransitionIterator and BootStrapIterator in
https://github.com/facebookresearch/mbrl-lib/blob/b0aabd79941efe8b56bcabbd1b43bf497b9b1746/mbrl/replay_buffer.py

Examples:

- Select transitions from highly rewarding trajectories. This could be used to analyze how data selection impacts MBRL, objective mismatch, etc.
- Select transitions uniformly at random from the replay buffer to get training/validation sets of a fixed size.

Motivation

This would make analysis similar to https://arxiv.org/abs/2002.04523 and https://arxiv.org/abs/2102.13651 easy to perform.

Pitch

This should be fairly easy to implement along the lines of TransitionIterator and BootStrapIterator above. (Handling trajectory/episode boundaries correctly could be a bit tricky.)
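To make the two examples and the boundary handling a bit more concrete, here is a minimal NumPy sketch. It is not based on the mbrl-lib API: it assumes the buffer exposes flat `rewards` and `dones` arrays in insertion order, and the function names (`trajectory_bounds`, `top_return_indices`, `fixed_size_split`) are hypothetical. The returned index arrays could then be fed to whatever iterator does the batching.

```python
import numpy as np


def trajectory_bounds(dones):
    """Return (start, end) index pairs, end exclusive, one per trajectory."""
    dones = np.asarray(dones, dtype=bool)
    ends = np.flatnonzero(dones) + 1
    # Keep a trailing unfinished trajectory as its own segment.
    if len(dones) > 0 and (len(ends) == 0 or ends[-1] != len(dones)):
        ends = np.append(ends, len(dones))
    starts = np.concatenate(([0], ends[:-1]))
    return list(zip(starts.tolist(), ends.tolist()))


def top_return_indices(rewards, dones, fraction=0.5):
    """Indices of all transitions in the top `fraction` of trajectories by return."""
    rewards = np.asarray(rewards, dtype=float)
    bounds = trajectory_bounds(dones)
    returns = np.array([rewards[s:e].sum() for s, e in bounds])
    num_keep = max(1, int(round(fraction * len(bounds))))
    best = np.argsort(-returns, kind="stable")[:num_keep]
    # Whole trajectories are kept, so episode boundaries are never split.
    return np.concatenate([np.arange(*bounds[i]) for i in sorted(best)])


def fixed_size_split(buffer_size, num_train, num_val, seed=0):
    """Disjoint random train/validation index sets of fixed sizes."""
    if num_train + num_val > buffer_size:
        raise ValueError("requested more transitions than the buffer holds")
    perm = np.random.default_rng(seed).permutation(buffer_size)
    return perm[:num_train], perm[num_train:num_train + num_val]


rewards = [1, 1, 1, 5, 5, 0, 0, 0, 0]
dones = [False, False, True, False, True, False, False, False, False]
selected = top_return_indices(rewards, dones, fraction=0.34)
train_idx, val_idx = fixed_size_split(len(rewards), num_train=4, num_val=2)
```

Selecting whole trajectories is one simple way to sidestep the boundary problem; a variant that samples sub-sequences would need to clip each window to its `(start, end)` pair.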


luisenp · 2021-04-07T13:07:59Z

Thanks @RaghuSpaceRajan. cc'ing @natolambert since this is highly relevant to his work. I think this proposal is the most straightforward way to do this on the data management side.

natolambert · 2021-04-07T16:44:38Z

Yes, I have a version of this in my private repo, and I will create a PR for it soon. My approach associates a "weight" with each transition, and part of the core functionality is a function to "update weights" for each trajectory. When updating the weights, it would be easy to apply a ranking or heuristic mapping of some sort.
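For readers following along, the weighting idea could look roughly like the sketch below. This is only an illustration under assumptions, not the code from the private repo mentioned above: the class `WeightedTransitionSampler`, its methods, and `rank_scores` are all hypothetical names, and trajectories are assumed to be given as `(start, end)` index pairs with `end` exclusive.

```python
import numpy as np


class WeightedTransitionSampler:
    """Hypothetical helper that keeps one sampling weight per stored transition."""

    def __init__(self, num_transitions, seed=0):
        self.weights = np.ones(num_transitions, dtype=float)
        self._rng = np.random.default_rng(seed)

    def update_weights(self, bounds, scores):
        """Give every transition in trajectory i the weight scores[i]."""
        for (start, end), score in zip(bounds, scores):
            self.weights[start:end] = score

    def sample(self, batch_size):
        """Draw transition indices with probability proportional to weight."""
        probs = self.weights / self.weights.sum()
        return self._rng.choice(len(self.weights), size=batch_size, p=probs)


def rank_scores(returns):
    """Map returns to ranks: 1 for the lowest return, N for the highest."""
    order = np.argsort(returns, kind="stable")
    ranks = np.empty(len(returns), dtype=float)
    ranks[order] = np.arange(1, len(returns) + 1)
    return ranks


bounds = [(0, 3), (3, 5), (5, 9)]      # three trajectories in a 9-step buffer
returns = np.array([3.0, 10.0, 0.0])   # one return per trajectory

sampler = WeightedTransitionSampler(num_transitions=9)
sampler.update_weights(bounds, rank_scores(returns))
batch = sampler.sample(batch_size=4)
```

Swapping `rank_scores` for another mapping (for example, a 0/1 mask over trajectories) changes the selection rule without touching the sampler.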

natolambert · 2021-04-07T17:05:44Z

Related comment: I think it may be worthwhile to have an optional "rich logging" mode, where things like candidate actions, action sequences (plans) at each step, trajectories, and more are saved for every trial in the learning process. The data accumulates quickly, but having access to it is useful for debugging.

luisenp · 2021-04-07T17:26:30Z

Feel free to open a feature request issue for this as well, @natolambert.

RaghuSpaceRajan added the enhancement New feature or request label Apr 7, 2021

natolambert mentioned this issue Jul 26, 2022

Add trajectory-based dynamics model #158

Closed

12 tasks
