[Draft] Coding trees with TensorDict #146

Draft · wants to merge 9 commits into main
Conversation

@tcbegley (Contributor) commented Jan 5, 2023

Description

This PR contains a draft implementation of support for trees using TensorDict.

We allow the user to create a tree structure in which every node is a tensordict with the same keys. The entire contents of the tree are backed by a single source tensordict with pre-allocated memory; as nodes are added to the tree, their data is written into the source. The primary advantage of this setup is that collecting data from multiple nodes reduces to a single index into the source, which is far more efficient than the naive alternative of iterating over the nodes of interest and stacking the results.
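
To make the trade-off concrete, here is a minimal sketch using plain torch tensors rather than the prototype API: it contrasts stacking per-node tensors on every query with writing each node into a pre-allocated source once and gathering with a single index.

import torch

# Naive gather: keep each node's data separately and stack on every query.
node_data = {"root": torch.rand(2, 3, 4), "left": torch.rand(2, 3, 4)}
stacked = torch.stack([node_data["root"], node_data["left"]])  # allocates and copies per query

# Source-backed gather: write each node into a pre-allocated buffer once at insertion,
# after which any subset of nodes is a single advanced-indexing op on the source.
source = torch.empty(100, 2, 3, 4)  # room for 100 nodes
slots = {"root": 0, "left": 1}
for name, i in slots.items():
    source[i] = node_data[name]

gathered = source[[slots["root"], slots["left"]]]  # one vectorised index
assert torch.equal(stacked, gathered)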

The interface is very preliminary and up for debate. The tests should give an idea of usage; here's a very basic example.

>>> import torch
>>> from tensordict import TensorDict
>>> from tensordict.prototype import make_tree

>>> # create the root node from a tensordict
>>> # preallocate memory for 100 nodes in the source
>>> root = make_tree(TensorDict({"data": torch.rand(2, 3, 4)}, [2, 3]), n_nodes=100)

>>> # adding a new tensordict as a child creates a new node in the tree
>>> root["left"] = TensorDict({"data": torch.rand(2, 3, 4)}, [2, 3])

>>> # we can now gather the data for the two nodes in the tree
>>> root.get_multiple_items("data", ("left", "data"))
tensor(...)  # shape torch.Size([2, 2, 3, 4])

>>> # alternatively we can create a tree from a nested tensordict
>>> root = make_tree(
...     TensorDict(
...         {"data": torch.ones(2, 3, 4), "left": TensorDict({"data": torch.zeros(2, 3, 4)}, [2, 3])},
...         [2, 3]
...     ),
...     n_nodes=100,
... )
>>> root.get_multiple_items("data", ("left", "data"))
tensor(...)  # shape torch.Size([2, 2, 3, 4])

@tcbegley marked this pull request as draft January 5, 2023 14:32
@vmoens (Contributor) left a comment

This is super cool!
Can we add a file in the benchmark directory to test the tree against a plain tensordict implementation (if that makes sense)?
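
A benchmark file along these lines might look roughly as follows. This is only a sketch: it assumes the make_tree / get_multiple_items interface from this PR, uses an arbitrary repetition count, and compares the tree against an equivalent plain nested TensorDict.

import timeit

import torch
from tensordict import TensorDict
from tensordict.prototype import make_tree

def nested_td():
    # a root node and a "left" child, each holding a [2, 3, 4] tensor
    return TensorDict(
        {"data": torch.rand(2, 3, 4), "left": TensorDict({"data": torch.rand(2, 3, 4)}, [2, 3])},
        [2, 3],
    )

# baseline: plain nested tensordict, gather by stacking individual lookups
td = nested_td()

def plain_gather():
    return torch.stack([td["data"], td["left", "data"]])

# tree: nodes backed by a pre-allocated source, gather with a single call
root = make_tree(nested_td(), n_nodes=100)

def tree_gather():
    return root.get_multiple_items("data", ("left", "data"))

print("tensordict:", timeit.timeit(plain_gather, number=100_000))
print("tree:      ", timeit.timeit(tree_gather, number=100_000))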

@vmoens added the enhancement label Jan 5, 2023
@tcbegley (Contributor, Author) commented Jan 5, 2023

Added some simple benchmarks. Let me know what else you think we should benchmark.

get_multiple_items
tensordict: 2.145204999993439
tree: 0.9129869579919614

get_multiple_items_deep
tensordict: 2.4592998750013066
tree: 0.8956774580001365

@vmoens (Contributor) commented Jan 5, 2023

impressive!
