dfdazac/dgi

Deep Graph Infomax

Deep Graph Infomax (DGI) is an unsupervised algorithm for learning node representations of a graph, which can then be used in downstream tasks such as node classification.

This is a TensorFlow implementation of DGI, based on the Graph Convolutional Network implementation by Thomas Kipf.

Installation

python setup.py install

Requirements

  • tensorflow (>0.12)
  • networkx

Run

First train a DGI model:

python train.py --model dgi

Once the model is trained, the graph embeddings are saved as a pickle file in the runs folder. Take note of its path (e.g. runs/2018-11-04-164053/embeddings.p) and use it to train a logistic regression model on the node classification task:

python train.py --model logreg --embeddings_path runs/2018-11-04-164053/embeddings.p
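To inspect the saved embeddings yourself before training the classifier, a minimal sketch along the following lines may help. The helper name load_embeddings is hypothetical and not part of this repository, and the type of the pickled object (for example, a NumPy array with one row per node) is an assumption, so check what is actually returned.

```python
import pickle


def load_embeddings(path):
    """Return whatever object was pickled at `path` (hypothetical helper).

    Only unpickle files you created yourself: pickle can run arbitrary
    code while loading.
    """
    with open(path, "rb") as f:
        return pickle.load(f)


# Example usage, with the path printed by your own training run:
# embeddings = load_embeddings("runs/2018-11-04-164053/embeddings.p")
# print(type(embeddings), getattr(embeddings, "shape", None))
```

If the file was written under Python 2 and is read under Python 3, passing encoding="latin1" to pickle.load may be needed.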

Data

To use your own data, you have to provide:

  • an N by N adjacency matrix (N is the number of nodes),
  • an N by D feature matrix (D is the number of features per node), and
  • an N by E binary label matrix (E is the number of classes).
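As a rough illustration of the three shapes listed above, the sketch below builds them for a toy graph with NumPy. The variable names, the dense array types, and the toy values are all assumptions made for illustration. The format the code actually expects (for example, sparse matrices) is defined by load_data() in utils.py.

```python
import numpy as np

# Toy undirected graph (assumed for illustration): N = 4 nodes, 3 edges.
N, D, E = 4, 3, 2
edges = [(0, 1), (1, 2), (2, 3)]

# N by N adjacency matrix, symmetric because the graph is undirected.
adj = np.zeros((N, N), dtype=np.float32)
for i, j in edges:
    adj[i, j] = 1.0
    adj[j, i] = 1.0

# N by D feature matrix; random values stand in for real node features.
rng = np.random.RandomState(0)
features = rng.rand(N, D).astype(np.float32)

# N by E binary label matrix: one-hot rows built from integer class ids.
class_ids = np.array([0, 0, 1, 1])
labels = np.eye(E, dtype=np.int32)[class_ids]

print(adj.shape, features.shape, labels.shape)
```

Each row of labels contains exactly one 1, in the column of that node's class.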

Have a look at the load_data() function in utils.py for an example.

In this example, we load citation network data (Cora, Citeseer or Pubmed). The original datasets can be found here: http://linqs.cs.umd.edu/projects/projects/lbc/. In our version (see data folder) we use dataset splits provided by https://github.com/kimiyoung/planetoid (Zhilin Yang, William W. Cohen, Ruslan Salakhutdinov, Revisiting Semi-Supervised Learning with Graph Embeddings, ICML 2016).

You can specify a dataset as follows:

python train.py --dataset citeseer

(or by editing train.py)

Models

You can choose between the following models:
