
Label-GCN

A variation of the Graph Convolutional Network (GCN) that allows the model to learn from labelled nodes in a graph. Paper available at https://arxiv.org/abs/2104.02153.
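As a conceptual illustration only (not the repository's implementation), the core idea of a Label-GCN-style layer can be sketched in NumPy: known training labels are appended as extra input features, and the node's own label is excluded from its aggregation so it cannot trivially copy itself. All names, shapes, and the single-layer structure below are illustrative assumptions.

```python
import numpy as np

def label_gcn_layer(A_hat, X, Y_onehot, train_mask, W):
    """One illustrative Label-GCN-style propagation step (a sketch, not the repo's code).

    A_hat:      normalized adjacency with self-loops, shape (n, n)
    X:          node features, shape (n, f)
    Y_onehot:   one-hot labels, shape (n, c)
    train_mask: boolean (n,); True where the label is known (training nodes)
    W:          weight matrix, shape (f + c, h)
    """
    # Expose only the labels of training nodes as extra features.
    L = Y_onehot * train_mask[:, None]
    # Zero the diagonal for the label channels, so a node aggregates its
    # neighbours' labels but never its own.
    A_offdiag = A_hat - np.diag(np.diag(A_hat))
    label_prop = A_offdiag @ L
    # Concatenate propagated features and propagated labels, then apply
    # a linear transform followed by ReLU.
    H = np.concatenate([A_hat @ X, label_prop], axis=1)
    return np.maximum(H @ W, 0.0)
```

The diagonal masking is what distinguishes this sketch from naively feeding labels in as features: without it, a training node would see its own label and the layer would learn the identity mapping on the label channels.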

Overview

The implementation of Label-GCN relies on the functionality available through Keras and Stellargraph. The Stellargraph library has been modified, with the main logic of Label-GCN contained in the file label_gcn.py located at stellargraph/layer/label_gcn.py. This modified Stellargraph library is available at https://github.com/cbellei/stellargraph and is used as a submodule in this project (see next section).

Unfortunately, TensorFlow does not easily support sparse tensors within the tf.linalg module (see tensorflow/tensorflow#27380 for details); as a result, the implementation provided in this project is currently inefficient for large graphs such as the Elliptic dataset.
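A back-of-the-envelope calculation shows why falling back to dense operations hurts at scale. The node count and average degree below are rough, illustrative figures for a graph of about the Elliptic dataset's size, not exact properties of it:

```python
# Approximate memory cost of a sparse vs. a densified adjacency matrix.
# n and avg_degree are illustrative assumptions, not exact dataset statistics.
n = 200_000
avg_degree = 3

# Sparse (CSR-like) storage keeps only the nonzero entries:
# float64 values + int32 column indices + int32 row pointers.
nnz = n * avg_degree
sparse_bytes = nnz * (8 + 4) + (n + 1) * 4

# Dense storage materializes every entry of the n x n matrix as float64.
dense_bytes = n * n * 8

print(f"sparse ~ {sparse_bytes / 1e6:.1f} MB, dense ~ {dense_bytes / 1e9:.1f} GB")
```

The dense matrix is several hundred gigabytes, far beyond typical GPU or host memory, while the sparse representation fits in a few megabytes.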

Installation

  • Tested with Python 3.6
  • Clone the repository and add the Stellargraph submodule, modified with the addition of Label-GCN:

```
git clone https://github.com/cbellei/LabelGCN.git
cd LabelGCN
git submodule init
git submodule update
```

  • Set up the environment (Anaconda):

```
conda create -n LabelGCN python=3.6
conda activate LabelGCN
pip install -r requirements.txt
cd stellargraph
pip install -e .
```

Datasets

The CORA, Citeseer and Pubmed datasets are available through the Stellargraph library. The Elliptic dataset is available at https://www.kaggle.com/ellipticco/elliptic-data-set. This project expects the Elliptic dataset to be located under a directory named elliptic_bitcoin_dataset.
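After downloading and unzipping the Kaggle archive, the directory should contain the dataset's three CSV files, roughly as sketched below (the layout is an assumption based on the Kaggle distribution; verify against the downloaded archive):

```
elliptic_bitcoin_dataset/
├── elliptic_txs_features.csv
├── elliptic_txs_classes.csv
└── elliptic_txs_edgelist.csv
```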

Running the experiments

The transductive experiments of Tables 3 and 4 can be reproduced by running the file experiments_transductive.py. The dataset, the number of random states, and the number of runs per random state are set by the flags ds, ns and nr, respectively. For the quickest run, use ds=cora, ns=1 and nr=1, which corresponds to the following command:

```
python src/experiments_transductive.py -ds cora -ns 1 -nr 1
```

The inductive experiment of Table 5 can be reproduced by running the file experiments_inductive.py. Three flags can be set in this case: ns determines the number of random states, as before, while nr1 and nr2 set the number of runs for the standalone classical machine learning models and for the classical machine learning models augmented with the GCN/Label-GCN embeddings, respectively. In the latter case, nr2 runs are performed for each set of embeddings (one set per random state, as set via ns). For this experiment, the quickest run results from the command:

```
python src/experiments_inductive.py -ns 1 -nr1 1 -nr2 1
```

NOTE: All runs involving the Elliptic dataset are computationally demanding. Producing Tables 4 and 5 of the paper required running on a server for several hours.
