BenchTemp: A General Benchmark for Evaluating Temporal Graph Neural Networks

News!!!

5/1/2024 - The Myket Dataset used in paper (https://arxiv.org/abs/2308.06862) has been added to BenchTemp. Data preprocess code at preprocess/Myket.py
2/9/2023 - All datasets have been hosted on the open-source platform zenodo (https://zenodo.org/) with a Digital Object Identifier (DOI) 10.5281/zenodo.8267846 (https://zenodo.org/record/8267846).
20/8/2023 - We have added four large-scale datasets (eBay-Large, DGraphFin-Large, YouTubeReddit-Large, and Taobao-Large).

12/7/2023 - We have uploaded experimental codes in folder experimental_codes.
25/6/2023 - We have updated BenchTemp website.
24/6/2023 - We have updated the reference of BenchTemp on github.

Overview

BenchTemp is a general Benchmark Python Library for evaluating Temporal Graph Neural Networks (TGNNs) quickly and efficiently on various workloads. BenchTemp provides Benchmark Datasets, and unified pipelines (DataPreprocessor, DataLoader EdgeSampler, Evaluator, EarlyStopMonitor, BenchTempLoss, BenchTempOptimizer, and Leaderboard) for evaluating Temporal Graph Neural Networks on both link prediction task and node classification task.

Paper - BenchTemp: A General Benchmark for Evaluating Temporal Graph Neural Networks.

Datasets - https://zenodo.org/record/8267846
Code - https://github.com/qianghuangwhu/benchtemp
Leaderboards - https://my-website-6gnpiaym0891702b-1257259254.tcloudbaseapp.com/

The source codes for evaluating existing Temporal Graph Neural Networks based on BenchTemp are in folder experimental_codes.

BenchTemp Framework

BenchTemp Pipeline

BenchTemp Modules

Installation

Requirements

Please ensure that you have installed the following dependencies:

numpy >= 1.18.0
pandas >= 1.2.0
sklearn >= 0.20.0

PyPI install

pip install benchtemp

Usage Example

After installing benchtemp PyPI library, you can evaluating your TGNN models on dynamic link prediction task and dynamic node classification task easily and quickly. For example:

Dynamic Link Prediction

benchtemp provides lp.DataLoader, lp.RandEdgeSampler， EarlyStopMonitor, Evaluator for dynamic link prediction task. Users can evaluating their TGNN models by those components provided by benchtemp. See train_link_prediction.py in folder experimental_codes for details.

Example Framework:

# please import our benchtemp library
import benchtemp as bt

# For example, if you are training , you should create a training  RandEdgeSampler based on the training dataset.
data = bt.lp.DataLoader(dataset_path="./data/", dataset_name='mooc')

# dataloader for dynamic link prediction task

node_features, edge_features, full_data, train_data, val_data, test_data, new_node_val_data, new_node_test_data, new_old_node_val_data, new_old_node_test_data, new_new_node_val_data, new_new_node_test_data, unseen_nodes_num = data.load()


train_rand_sampler = bt.lp.RandEdgeSampler(train_data.sources, train_data.destinations)

monitor = bt.EarlyStopMonitor()

# Users' own TGNN models or SOTA TGNN models
model = TGNN(parameters)
...
for epoch in range(args.epochs):
    ...
    # sample an equal amount of negatives to the positive interactions.
    size = len(train_data)
    _, negatives_batch = train_rand_sampler.sample(size)
    ...
    ...
    pre_positive, pre_negative = model(positive_batch,negatives_batch)
    loss = loss_function(pre_positive, pre_negative, labels)
    ...
    ...
    val_ap = model(val_data)
    if monitor.early_stop_check(val_ap):
      break
...

# testing
pre = model(test_data)
results = bt.evaluator(pre, labels)

Dynamic Node Classification

benchtemp provides nc.DataLoader, EarlyStopMonitor, Evaluator for dynamic node classification. See train_node_classification.py in folder experimental_codes for details.

# please import our benchtemp library
import benchtemp as bt

# For example, if you are training , you should create a training  RandEdgeSampler based on the training dataset.
data = bt.nc.DataLoader(dataset_path="./data/", dataset_name='mooc')

# dataloader for dynamic node  classification task

node_features, edge_features, full_data, train_data, val_data, test_data, new_node_val_data, new_node_test_data, new_old_node_val_data, new_old_node_test_data, new_new_node_val_data, new_new_node_test_data, unseen_nodes_num = data.load()


# Users' own TGNN models or SOTA TGNN models
model = TGNN(parameters)
...
for epoch in range(args.epochs):
    ...
    # sample an equal amount of negatives to the positive interactions.
    size = len(train_data)
    ...
    ...
    pre_positive, pre_negative = model(positive_batch,negatives_batch)
    loss = loss_function(pre_positive, pre_negative, labels)
    ...
    ...
    val_ap = model(val_data)
    if monitor.early_stop_check(val_ap):
      break
...

# testing
pre = model(test_data)
results = bt.evaluator(pre, labels)

BenchTemp Reference

DataPreprocessor

The datasets that have been preprocessed by BenchTemp are Here. You can directly download the datasets and then put them into the directory './data'.

In addition, BenchTemp provides DataPreprocessor class for you to preprocess yours TGNNs datasets.

Class:

DataPreprocessor(data_path: str, data_name: str)

Args:

data_path: str - The path of the dataset.
data_name: str - The name of the dataset.

Function:

DataPreprocessor.data_preprocess(bipartite: bool)

Args:

bipartite: bool - Whether the Temporal Graph is a bipartite graph (Heterogeneous or Homogeneous).

Returns:

ml_{data_name}.csv - The csv file of the Temporal Graph. This file have five columns with properties:
- 'u': The id of the user.
- 'i': The id of the item.
- 'ts': The timestamp of the interaction (edge) between the user and the item.
- 'label': The label of the interaction (edge).
- 'idx': The index of the interaction (edge).
ml_{data_name}.npy - The edge features corresponding to the interactions (edges) in the the Temporal Graph..
ml_{data_name}_node.npy - The initialization node features of the Temporal Graph.

Example:

import benchtemp as bt

processor = bt.DataPreprocessor(data_path="./data/", data_name="mooc")
# If the dataset is bipartite graph, i.e. the user (source nodes) and the item (destination nodes) are of the same type.
processor.data_preprocess(bipartite=True)

# If the dataset is non-bipartite graph.
processor.data_preprocess(bipartite=False)

TemporalGraph

The class of a temporal graph. A temporal graph can be represented as an ordered sequence of temporal user-item interactions $I_{r}=(u_{r}, i_{r}, t_{r}, e_{r})$, $0 \leq t_{1} \leq \dots t_{r} \dots \leq T$. The $r$-th interaction $I_{r}$ happens at time $t_{r}$ between user $u_{r}$ and item $i_{r}$ with edge feature $e_{r}$.

Class:

TemporalGraph(sources: numpy.array, destinations: numpy.array, timestamps: numpy.array, edge_idxs: numpy.array, labels: numpy.array)

Args:

sources: numpy.array - Array of sources of Temporal Graph edges.
destinations: numpy.array - Array of destinations of Temporal Graph edges.
timestamps: numpy.array - Array of timestamps of Temporal Graph edges.
edge_idxs: numpy.array - Array of edge IDs of Temporal Graph edges.
labels: numpy.array - Array of labels of Temporal Graphe dges.

Returns:

benchtemp.TemporalGraph. A Temporal Graph.

Example:

import pandas as pd
import numpy as np
import benchtemp as bt


graph_df = pd.read_csv("dataset_path")

sources = graph_df.u.values
destinations = graph_df.i.values
edge_idxs = graph_df.idx.values
labels = graph_df.label.values
timestamps = graph_df.ts.values

# For example, the full Temporal Graph of the dataset is full_data.
full_data = bt.TemporalGraph(sources, destinations, timestamps, edge_idxs, labels)

lp.DataLoader

The DataLoader class for link prediction tasks.

In transductive link prediction, Dataloader splits the temporal graphs chronologically into 70%-15%-15% for train, validation and test sets according to edge timestamps.

In inductive link prediction, Dataloader performs the same split as the transductive setting, and randomly masks 10% nodes as unseen nodes. Any edges associated with these unseen nodes are removed from the training set. To reflect different inductive scenarios, DataLoader further generates three inductive test sets from the transductive test dataset, by filtering edges in different manners:

Inductive - selects edges with at least one unseen node.
Inductive New-Old - selects edges between a seen node and an unseen node.
Inductive New-New - selects edges between two unseen nodes.

Class:

lp.DataLoader(dataset_path: str, dataset_name: str, different_new_nodes_between_val_and_test: bool, randomize_features: bool)

Args:

dataset_path: str - The path of the dataset.
dataset_name: str - The name of dataset.
different_new_nodes_between_val_and_test: bool - The new nodes are between validation set and test set.
randomize_features: str - Random initialization of node features.

Function:

lp.DataLoader.load()

Returns:

node_features: numpy.array - Array of the Node Features of the Temporal Graph.
edge_features: numpy.array - Array of the Edge Features of the Temporal Graph.
full_data: benchtemp.TemporalGraph - Full Temporal Graph dataset.
train_data: benchtemp.TemporalGraph - The training set.
val_data: benchtemp.TemporalGraph - The validation set.
test_data: benchtemp.TemporalGraph - The Transductive test set.
new_node_val_data: benchtemp.TemporalGraph - The Inductive validation set.
new_node_test_data: benchtemp.TemporalGraph - The Inductive test set.
new_old_node_val_data: benchtemp.TemporalGraph - The Inductive New-Old validation set.
new_old_node_test_data: benchtemp.TemporalGraph - The Inductive New-Old test set.
new_new_node_val_data: benchtemp.TemporalGraph - The Inductive New-New validation set.
new_new_node_test_data: benchtemp.TemporalGraph - The Inductive New-New test set.
unseen_nodes_num: int - The number of unseen nodes in inductive setting.

Example:

import benchtemp as bt

data = bt.lp.DataLoader(dataset_path="./data/", dataset_name='mooc')

node_features, edge_features, full_data, train_data, val_data, test_data, new_node_val_data, new_node_test_data, new_old_node_val_data, new_old_node_test_data, new_new_node_val_data, new_new_node_test_data, unseen_nodes_num = data.load()

lp.RandEdgeSampler

BenchTemp provides the unified negative edge sampler class with a seed named RandEdgeSampler for link prediction task to sample an equal amount of negatives to the positive interactions.

Class:

RandEdgeSampler(src_list: numpy.array, dst_list: numpy.array, seed: int)

Args:

src_list: numpy.array - Array of source nodes.
dst_list: numpy.array - Array of destination nodes.
seed: numpy.array - The seed of random.

Function:

RandEdgeSampler.sample(size: int)

Args:

size: int - The size of the sampling negative edges.

Returns:

src_list: numpy.array - Array of source nodes of the sampling negative edges.
dst_list: numpy.array - Array of destination nodes of the sampling negative edges.

Example:

import benchtemp as bt

# For example, if you are training , you should create a training  RandEdgeSampler based on the training dataset.
train_rand_sampler = bt.lp.RandEdgeSampler(train_data.sources, train_data.destinations)

...
for epoch in range(args.epochs):
    ...
    # sample an equal amount of negatives to the positive interactions.
    size = len(train_data)
    _, negatives_batch = train_rand_sampler.sample(size)
    ...
...

nc.DataLoader

The DataLoader class for the node classification task. The DataLoader module sorts edges and splits the input dataset (70%-15%-15%) according to edge timestamps.

Class:

nc.DataLoader(dataset_path: str, dataset_name: str, use_validation: bool)

Args:

dataset_path: str - The path of the dataset.
dataset_name: str - The name of the dataset.
use_validation: bool - Whether use validation dataset or not.

Function:

nc.DataLoader.load()

Returns:

node_features: numpy.array - Array of the Node Features of the Temporal Graph.
edge_features: numpy.array - Array of the Edge Features of the Temporal Graph.
full_data: benchtemp.TemporalGraph - Full Temporal Graph dataset for node classification task.
train_data: benchtemp.TemporalGraph - The training set for node classification task.
val_data: benchtemp.TemporalGraph - The validation set for node classification task.
test_data: benchtemp.TemporalGraph - The test set for node classification task.

Example:

import benchtemp as bt

data = bt.nc.DataLoader(dataset_path="./data/", dataset_name='mooc', use_validation=True)

node_features, edge_features, full_data, train_data, val_data, test_data = data.load()

EarlyStopMonitor

BenchTemp provides a unified EarlyStopMonitor to improve training efficiency and save resources.

Class:

EarlyStopMonitor(max_round: int, higher_better: bool, tolerance: float)

Args:

max_round: int - The number of rounds for early stop.
higher_better: bool - The higher the value, the better the performance.
tolerance: float - The tolerance of the EarlyStopMonitor.

Function:

EarlyStopMonitor.early_stop_check(curr_val:float)

Args:

curr_val: float - The value to check for early stop.

Returns:

True - If the value matches the setting of the EarlyStopMonitor.
False - If the value does not match the setting of the EarlyStopMonitor.

Example:

import benchtemp as bt

...
early_stopper = bt.EarlyStopMonitor(max_round=args.patience)
for epoch in range(args.epochs):
    ...
    val_ap = model(val_datasets)
    if early_stopper.early_stop_check(val_ap):
        break
    ...
...

Evaluator

Different evaluation metrics are available, including Area Under the Receiver Operating Characteristic Curve (ROC AUC) and Average Precision (AP). Usually, metrics Area Under the Receiver Operating Characteristic Curve (ROC AUC) and average precision (AP) are for the link prediction task, while metrics AUC is for the node classification task.

Class:

Evaluator(task_name: str)

Args:

task_name: str - the name of the task, choice in ["LP", "NC"], LP for the link prediction task and NC for the node classification task.

Function:

Evaluator.eval(pred_score: numpy.array, true_label: numpy.array)

Args:

pred_score: numpy.array- Array of prediction scores.
true_label: numpy.array - Array of true labels.

Returns:

AUC: float - the value of the AUC.
AP: float - the value of the AP.

Example:

import benchtemp as bt

# For example, Link prediction task. Evaluation Metrics: AUC, AP.
evaluator = bt.Evaluator("LP")

...
# test data
pred_score = model(test_data)
test_auc, test_ap = evaluator.eval(pred_score, true_label)
...

import benchtemp as bt

# For example, node classification task. Evaluation Metrics: AUC.
evaluator = bt.Evaluator("NC")

...
# test data
pred_score = model(test_data)
test_auc = evaluator.eval(pred_score, true_label)
...

Call for Contributions

BenchTemp project is looking for contributors with expertise and enthusiasm! If you have the desire to contribute to BenchTemp, please contact BenchTemp team. Contributions and issues from the community are eagerly welcomed, with which we can together push forward the TGNN research.

Name		Name	Last commit message	Last commit date
Latest commit History 126 Commits
DGraphFin		DGraphFin
__pycache__		__pycache__
core		core
experimental_codes		experimental_codes
img		img
lp		lp
nc		nc
optimization		optimization
preprocess		preprocess
utils		utils
.DS_Store		.DS_Store
LICENSE		LICENSE
README.md		README.md
__init__.py		__init__.py
appendix.pdf		appendix.pdf
response.pdf		response.pdf

License

qianghuangwhu/benchtemp

Folders and files

Latest commit

History

Repository files navigation

BenchTemp: A General Benchmark for Evaluating Temporal Graph Neural Networks

Table of Contents

News!!!

Overview

BenchTemp Framework

BenchTemp Pipeline

BenchTemp Modules

Installation

Requirements

PyPI install

Usage Example

Dynamic Link Prediction

Dynamic Node Classification

BenchTemp Reference

DataPreprocessor

TemporalGraph

lp.DataLoader

lp.RandEdgeSampler

nc.DataLoader

EarlyStopMonitor

Evaluator

Call for Contributions

About

Topics

Resources

License

Stars

Watchers

Forks

Languages