Make delta_timestamps in the policy #163
base: main
Conversation
```python
# This policy makes use of past observations and a horizon of actions for training. The dataset needs to know
# about this. For that, the policy has a method that takes a frames-per-second (fps) argument and
# returns a dictionary mapping each data key to a list of relative timestamps. For example, if we need a
# horizon of 4 actions, starting from the "previous" frame and at 10 FPS, the dictionary would contain
# {"action": [-0.1, 0.0, 0.1, 0.2]}. The policy is equipped with a method to produce this dictionary.
# Here, we know that the fps for the PushT simulation environment is 10.
dataset.delta_timestamps = policy.make_delta_timestamps(fps=10)
```
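For illustration, here is a minimal sketch of what such a method could compute. The function name `make_delta_timestamps` comes from the diff above, but the signature, parameters, and key names here are assumptions, not the actual implementation:

```python
# Hypothetical sketch of what a make_delta_timestamps-style method
# could compute; the real lerobot implementation may differ. Assumes
# the policy knows its own n_obs_steps (number of past observation
# frames, current frame included) and horizon (number of actions).
def make_delta_timestamps(n_obs_steps: int, horizon: int, fps: int) -> dict:
    # Observation deltas: n_obs_steps frames ending at the current one.
    obs = [i / fps for i in range(1 - n_obs_steps, 1)]
    # Action deltas: `horizon` frames starting at the earliest observation.
    act = [i / fps for i in range(1 - n_obs_steps, 1 - n_obs_steps + horizon)]
    return {"observation.image": obs, "action": act}

# With a horizon of 4 actions, 2 observation steps, and 10 FPS:
# make_delta_timestamps(2, 4, 10)
# -> {"observation.image": [-0.1, 0.0], "action": [-0.1, 0.0, 0.1, 0.2]}
```

This reproduces the example from the comment: the action horizon starts at the "previous" frame (-0.1 s at 10 FPS) and extends 4 frames forward.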
I am so sorry, I am not a fan of this design. I think it's too complicated.
Another possible implementation, if we don't want to use `eval`: let's replace this:
```yaml
delta_timestamps:
  observation.image: "[i / ${fps} for i in range(1 - ${policy.n_obs_steps}, 1)]"
```
by this:
```yaml
delta_timestamps:
  observation.image: [-0.1, 0.0]
```
Ideally I want to grid search my context directly from command line:
```shell
python .../train.py delta_timestamps.observation.image=-1.0,-0.5,0.0
python .../train.py delta_timestamps.observation.image=-2.0,-1.5,-1.0,-0.5,0.0
python .../train.py delta_timestamps.observation.image=-0.3,-0.2,-0.1,0.0
```
This kind of stuff should be easy to do.
cc @aliberts for visibility
Thanks for reviewing it mate. Okay, so:
- Is it more complicated? The prior approach in this example was to hand-craft the delta timestamps, which requires knowledge of the internal workings of diffusion policy that could otherwise be glossed over. I'm open to a less complicated design than mine though!
- I'm sorry but I don't understand why someone would tweak `delta_timestamps` like that directly in the command line. It would break things. They'd have to tweak `horizon`, `n_obs_steps` and `fps`, and then based on those, `delta_timestamps` would be uniquely determined, so if you make them specify it, they need to go and figure out how it's uniquely determined. (This also goes for your config yaml example where you explicitly write out the timestamps, and that's really not nice for things with long horizons.)
Actually though, I realized you do have a point @Cadene. `delta_timestamps` isn't necessarily uniquely determined. While its "topology" is determined (for example with DP, `n_obs_steps=2` means we have 1 negative delta and the current frame, no way around it), we might actually want a different delta magnitude. For example, say we are using DP with ALOHA at 50 FPS and want `n_obs_steps=2`. Then `delta_timestamps=[-0.02, 0.00]` with my current logic. But what if I just want 100 ms worth of context without having to increase `n_obs_steps`, i.e. `[-0.1, 0.0]`? This is perfectly valid for the DP logic, and is actually something I'd want to try.
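One way to decouple the context span from `n_obs_steps`, along the lines of what's being suggested, might be an explicit spacing parameter between observation frames. This is a hypothetical sketch under that assumption, not a proposed API:

```python
# Hypothetical variant that decouples the context span from
# n_obs_steps via an explicit spacing (in seconds) between
# observation frames; names and signature are assumptions.
def obs_deltas(n_obs_steps, fps, spacing_s=None):
    step = spacing_s if spacing_s is not None else 1 / fps
    return [round(i * step, 6) for i in range(1 - n_obs_steps, 1)]

# Default frame spacing at 50 FPS:
# obs_deltas(2, 50) -> [-0.02, 0.0]
# 100 ms of context without increasing n_obs_steps:
# obs_deltas(2, 50, spacing_s=0.1) -> [-0.1, 0.0]
```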
I'll rethink the design.
What this does
This PR moves the `delta_timestamps` logic to the policy. The reasons for doing so are:
- It avoids using `eval` for dynamically evaluating a config expression.
- The choice of `delta_timestamps` depends on the policy's internals, not on the config alone, so it should be made at runtime.
How it was tested
Explain/show how you tested your changes.
Examples:
- Added `test_something` in `tests/test_stuff.py`.
- Implemented `new_feature` and checked that training converges with policy X on dataset/environment Y.
- Optimized `some_function`, it now runs X times faster than previously.
How to checkout & try? (for the reviewer)
Provide a simple way for the reviewer to try out your changes.
Examples: