Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge TemporalData with Data/HeteroData [2/n] #8739

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

DamianSzwichtenberg
Copy link
Member

This PR attempts to merge TemporalData into HeteroData.
Currently, one can perform time-related modification/query using a specific node or edge type:

# bad: usage based on `BaseStorage` methods
for e in hetero_data.edge_types:
    hetero_data[e].up_to(10)
res = all([hetero_data[e].is_sorted_by_time() for e in hetero_data.edge_types])

This is possible because BaseStorage implements those functionalities. For better usability, however,
these functions should be exposed in the HeteroData API directly. Such an approach would also allow us to
perform operations on many node or edge types at once:

# good: usage based on `HeteroData` API
out = hetero_data.up_to(hetero_data.edge_types, 10)
res = hetero_data.is_sorted_by_time(hetero_data.edge_types)

Related: #3230, #8454

Copy link

codecov bot commented Jan 8, 2024

Codecov Report

Attention: Patch coverage is 86.84211% with 5 lines in your changes are missing coverage. Please review.

Project coverage is 89.23%. Comparing base (af0f5f4) to head (5dcdc0b).

Files Patch % Lines
torch_geometric/data/hetero_data.py 85.71% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8739      +/-   ##
==========================================
- Coverage   89.86%   89.23%   -0.63%     
==========================================
  Files         468      468              
  Lines       30039    30070      +31     
==========================================
- Hits        26995    26834     -161     
- Misses       3044     3236     +192     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

torch_geometric/data/hetero_data.py Show resolved Hide resolved
torch_geometric/data/hetero_data.py Outdated Show resolved Hide resolved
out[t].sort_by_time()
return out

def snapshot(self, types: Union[List[EdgeType], List[NodeType]],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a question about snapshot which is relevant to this PR and #8454.
Assume nodes n1 and n2 with time equals 1 and 4 respectively, and an edge e1 (between n1 and n2 ) with time = 5.
On calling snapshot(start_time = 4, end_time = 5) what would the output look like, from the code it looks like the returned graph will have node n2 and edge e1, but e1 doens't have a source node.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current design, time cannot be an edge and node attribute at the same time, so this scenario is impossible. As these functionalities are moved directly from TemporalData I would consider underlying data rather as a list of events.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there some part of the code that explicitly prevents this?
Does this also apply to hetero graphs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2x Yes, for example at BaseStorage and NeighborSampler.

out[t].snapshot(start_time, end_time)
return out

def up_to(self, types: Union[List[EdgeType], List[NodeType]],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For both up_to and snapshot , do we reindex edges. Like if we remove the 0th node from the graph, then does the 1st node become the 0th node in the edge index?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't. But that's a good point. When I integrated TemporalData into Data I kept the assumption that we represent events, so time as a node attribute is currently not well defined.

torch_geometric/data/hetero_data.py Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants