PairE-Graph-Representation-Learning-Beyond-Node-and-Homophily

"Graph Representation Learning Beyond Node and Homophily"

We provide the source code of paire. Our model implementation is based on keras, which allows the model to be trained by GPU.

Usage

Installation

run the following code
```
pip install -r requirements.txt
```

The source code is saved in the pair-embedding.ipynb file

Example

To run "PairE" on Cora network and evaluate the learned representations on multi-label node classification task, run the following command in the home directory of this project:

Input

The supported input format is an edgelist or an adjlist:

edgelist: node1 node2 
adjlist: node n1 n2 n3 ... nk

The graph is assumed to be undirected and unweighted by default. The model needs additional features, the supported feature input format is as follow (feature_i should be a float number):

node feature_1 feature_2 ... feature_n

Read data:

dataset = 'pubmed'
iG,G,G_label,G_attr = read_data(dataset)
num_classes = G_label['label'].map(lambda x:x[0]).nunique()

Output

The output contains two dataframes: pair_embedding has |E| lines for a graph with |E| edges. Node_embedding has n lines for a graph with n nodes. The |E| lines are as follows:

Pair<u, v> dim1 dim2 … dimd

The n lines are as follows:

node_id dim1 dim2 ... dimd

where dim1, ... , dimd is the d-dimensional representation learned by PairE.

Train model :

model = PairEmbedding(G,G_attr,G_label,embedding_size=128,epochs=30,silent=False)
node_embedding, edge_embedding, MultiTask_SelfSupervision_AE = model.fit()

Evaluation

If you want to evaluate the learned node representations, you can input the node labels. It will use a portion of nodes(default:10%、30%、50%、70%、90%) to train a classifier and calculate F1-score on the rest dataset.

The supported input label format is

node label1 label2 label3...

evaluate on multi-class node classfication:

model.evaluate(clf=[LogisticRegression(n_jobs=-1)])

Embedding visualization

We apply the dimensionality reduction method like t-SNE to the embedded visualization, and visualize the embedding of different data sets, where the colors of nodes represent the labels of nodes.

plot_embedding(node_embedding, G_label, dataset)

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
README.md		README.md
pair-embedding.ipynb		pair-embedding.ipynb
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

pair-embedding.ipynb

pair-embedding.ipynb

requirements.txt

requirements.txt

Repository files navigation

PairE-Graph-Representation-Learning-Beyond-Node-and-Homophily

Usage

Installation

Example

Input

Read data:

Output

Train model :

Evaluation

evaluate on multi-class node classfication:

Embedding visualization

About

Releases

Packages

Languages

syvail/PairE-Graph-Representation-Learning-Beyond-Node-and-Homophily

Folders and files

Latest commit

History

Repository files navigation

PairE-Graph-Representation-Learning-Beyond-Node-and-Homophily

Usage

Installation

Example

Input

Read data:

Output

Train model :

Evaluation

evaluate on multi-class node classfication:

Embedding visualization

About

Resources

Stars

Watchers

Forks

Languages