New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merge TemporalData
with Data
/HeteroData
[2/n]
#8739
base: master
Are you sure you want to change the base?
Merge TemporalData
with Data
/HeteroData
[2/n]
#8739
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #8739 +/- ##
==========================================
- Coverage 89.86% 89.23% -0.63%
==========================================
Files 468 468
Lines 30039 30070 +31
==========================================
- Hits 26995 26834 -161
- Misses 3044 3236 +192 ☔ View full report in Codecov by Sentry. |
89f4a25
to
5093a37
Compare
torch_geometric/data/hetero_data.py
Outdated
out[t].sort_by_time() | ||
return out | ||
|
||
def snapshot(self, types: Union[List[EdgeType], List[NodeType]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a question about snapshot
which is relevant to this PR and #8454.
Assume nodes n1 and n2 with time equals 1 and 4 respectively, and an edge e1 (between n1 and n2 ) with time = 5.
On calling snapshot(start_time = 4, end_time = 5)
what would the output look like, from the code it looks like the returned graph will have node n2 and edge e1, but e1 doens't have a source node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the current design, time
cannot be an edge and node attribute at the same time, so this scenario is impossible. As these functionalities are moved directly from TemporalData
I would consider underlying data rather as a list of events.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there some part of the code that explicitly prevents this?
Does this also apply to hetero graphs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2x Yes, for example at BaseStorage and NeighborSampler.
torch_geometric/data/hetero_data.py
Outdated
out[t].snapshot(start_time, end_time) | ||
return out | ||
|
||
def up_to(self, types: Union[List[EdgeType], List[NodeType]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For both up_to
and snapshot
, do we reindex edges. Like if we remove the 0th node from the graph, then does the 1st node become the 0th node in the edge index?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't. But that's a good point. When I integrated TemporalData
into Data
I kept the assumption that we represent events, so time
as a node attribute is currently not well defined.
5093a37
to
2eb2ccd
Compare
2eb2ccd
to
5dcdc0b
Compare
This PR attempts to merge
TemporalData
intoHeteroData
.Currently, one can perform time-related modification/query using a specific node or edge type:
This is possible because
BaseStorage
implements those functionalities. For better usability, however,these functions should be exposed in the
HeteroData
API directly. Such an approach would also allow us toperform operations on many node or edge types at once:
Related: #3230, #8454