Replace NeighborSampler with NeighborLoader in mag240m #382

yanbing-j · 2022-09-21T08:31:47Z

This PR is currently a draft and still contains many debug print statements.

rusty1s · 2022-09-21T08:53:03Z

ogb/lsc/mag240m.py

+        path = osp.join(self.dir, 'processed', 'paper', 'node_label.npy')
+        data["paper"].y = torch.from_numpy(np.load(path))
+        path = osp.join(self.dir, 'processed', 'paper', 'node_year.npy')
+        data["paper"].year = torch.from_numpy(np.load(path, mmap_mode='r'))


We would need to add data['author'].num_nodes = ... and data['institution'].num_nodes = ... to register them as node types.

yanbing-j · 2022-09-22 (edited)

data['author'].num_nodes = self.__meta__['author']
data['institution'].num_nodes = self.__meta__['institution']

I added these two lines to register author and institution as node types, but the RuntimeError is still there.

rusty1s · 2022-09-21T08:53:26Z

ogb/lsc/mag240m.py

+    def to_pyg_hetero_data(self):
+        data = HeteroData()
+        path = osp.join(self.dir, 'processed', 'paper', 'node_feat.npy')
+        # Current is not in-memory


Do you mean:

```suggestion
        # Currently in-memory only
```

yanbing-j · 2022-09-22

data["paper"].x = torch.from_numpy(np.load(path, mmap_mode='r')) comes from @property def paper_label(self)..., which is the code path used when self.in_memory is False. I left the comment as a reminder to myself to enable the in_memory path as well.
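The difference between the memory-mapped and in-memory paths can be sketched with NumPy alone. The file below is a small stand-in for processed/paper/node_year.npy, and its values are invented.

```python
import os
import tempfile

import numpy as np

# Stand-in for processed/paper/node_year.npy; the values are made up.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'node_year.npy')
np.save(path, np.array([2015, 2018, 2019, 2021], dtype=np.int64))

# Memory-mapped: data is read from disk only when it is indexed.
lazy = np.load(path, mmap_mode='r')

# In-memory: the whole array is read up front.
eager = np.load(path)

# Materializes only the requested slice of the memory-mapped file.
first_two = np.asarray(lazy[:2])
```

For arrays the size of the MAG240M features, the memory-mapped form avoids loading everything into RAM, at the cost of disk reads during sampling.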

rusty1s · 2022-09-21T08:53:45Z

ogb/lsc/mag240m.py

-        name = f'{src}___{rel}___{dst}'
-        path = osp.join(self.dir, 'processed', name, 'edge_index.npy')
-        return np.load(path)
+    # def edge_index(self, id1: str, id2: str,


Uncomment back in?

yanbing-j · 2022-09-22

The edge_index function is no longer needed. The edge indices can be read from data[('author', 'writes', 'paper')].edge_index, data[('author', 'affiliated_with', 'institution')].edge_index and data[('paper', 'cites', 'paper')].edge_index, right?

rusty1s · 2022-09-21T08:54:09Z

ogb/lsc/mag240m.py

@@ -163,7 +183,8 @@ def save_test_submission(self, input_dict: Dict, dir_path: str, mode: str):


 if __name__ == '__main__':
-    dataset = MAG240MDataset()
+    dataset = MAG240MDataset('/home/user/yanbing/pyg/ogb/ogb/lsc/dataset')
+    data = dataset.to_pyg_hetero_data()


Let's test this separately?

yanbing-j · 2022-09-22

/home/user/yanbing/pyg/ogb/ogb/lsc/dataset is my local dev root; I will remove it.
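One possible way to keep a local root out of the committed file is to read it from the environment. This is only a sketch: `MAG240M_ROOT` and `resolve_root` are hypothetical names, and the `'dataset'` fallback is an assumption about the default root.

```python
import os


def resolve_root(default='dataset'):
    # MAG240M_ROOT is a hypothetical variable name, not something OGB reads.
    return os.environ.get('MAG240M_ROOT', default)


root = resolve_root()
# dataset = MAG240MDataset(root)  # left commented: needs the dataset on disk
```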

rusty1s · 2022-09-21T08:57:58Z

examples/lsc/mag240m/gnn.py

-            adjs_t=[adj_t.to(*args, **kwargs) for adj_t in self.adjs_t],
-        )
-
+device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

 class MAG240M(LightningDataModule):


We could try to make use of torch_geometric.data.LightningNodeDataset for this. This would simplify the construction of neighbor loaders.

yanbing-j · 2022-09-22

Sorry, there is no LightningNodeDataset in PyG.

yanbing-j · 2022-09-22

Do you mean LightningNodeData? I will try it.

yanbing-j · 2022-09-26

I have updated the code to use LightningNodeData, but it still raises the RuntimeError: Node conv1__paper1 target conv1.author__writes__paper references nonexistent attribute author__writes__paper of conv1.

puririshi98 · 2023-04-24T18:56:06Z

@yanbing-j If you are not opposed, I can take this over when I find time in the next few weeks and finish this PR, as it is needed for my work.

yanbing-j · 2023-04-25T01:47:07Z

@puririshi98 Sure. Please go ahead.

yanbing-j added 3 commits September 19, 2022 15:36

71e07f5 Enable

1264086 MAG240M to HeteroData

10ee86a Try convert model to_hetero

rusty1s reviewed Sep 21, 2022

yanbing-j added 9 commits September 22, 2022 10:52

cfdf217 Add author and institution as node types

7480eea Use LightningNodeData

1232606 Add inst.npy

70f149a Add author.npy

e93c89c Add 3 nodes and remove relu/dropout in model

fd26879 add edges

2dfa0fb reverse edge types

a842429 reverse edge types and convert y to long

d6d0fd0 Use trainer.predict to run inference

yanbing-j force-pushed the yanbing/enable branch from 5e5d0d7 to d6d0fd0 Compare November 30, 2022 07:08
