
Global Counterfactual Explainer for Graph Neural Networks

This repository is a reference implementation of the global graph counterfactual explainer as described in the paper:

Global Counterfactual Explainer for Graph Neural Networks.
Mert Kosan*, Zexi Huang*, Sourav Medya, Sayan Ranu, Ambuj Singh.
ACM International Conference on Web Search and Data Mining, 2023.
https://dl.acm.org/doi/10.1145/3539597.3570376

The link contains the manuscript and the supplementary presentation video.

Requirements

The easiest way to install the dependencies is via conda. Once you have conda installed, run this command:

conda env create -f environment.yml

If you want to install the dependencies manually, note that we tested our code with Python 3.8.0 and the main dependencies listed in environment.yml.

All our experiments were run on a machine with two NVIDIA GeForce RTX 2080 GPUs (8GB of memory each) and 32 Intel Xeon CPUs (2.10GHz) with 128GB of RAM.

Generating Base Models

We provide pre-trained gnn and neurosed base models. If you want to run our method on your own dataset, you first have to train your own gnn and neurosed base models.

  • For gnn base models, you can use our gnn.py module (a generic sketch of such a model is shown right after this list).
  • For neurosed base models, please follow the neurosed repository.
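
For orientation, a gnn base model is simply a graph classifier trained on your dataset. Below is a minimal, generic PyTorch Geometric sketch of such a model; it is illustrative only and is not the architecture implemented in gnn.py.

import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_mean_pool

class GraphClassifier(torch.nn.Module):
    """Minimal graph classifier sketch (illustrative, not the gnn.py architecture)."""

    def __init__(self, num_features, hidden_dim, num_classes):
        super().__init__()
        self.conv1 = GCNConv(num_features, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, hidden_dim)
        self.lin = torch.nn.Linear(hidden_dim, num_classes)

    def forward(self, x, edge_index, batch):
        # Two rounds of message passing followed by a graph-level readout.
        x = F.relu(self.conv1(x, edge_index))
        x = F.relu(self.conv2(x, edge_index))
        x = global_mean_pool(x, batch)
        return self.lin(x)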

If the neurosed model is hard to train for your dataset, you will have to update our importance function to use your own graph edit distance function.
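
As a rough illustration, the snippet below computes exact graph edit distances with networkx in place of neurosed's learned approximation. The function name and the way it would be wired into the importance function are assumptions, not the repository's actual API; exact GED is exponential, so this is only practical for small graphs such as those in AIDS.

import networkx as nx

def pairwise_graph_edit_distances(query_graphs, target_graphs, timeout=10):
    # Hypothetical replacement for the neurosed distance queries used by the
    # importance function; adapt the name and call sites to the actual code.
    # With a timeout, networkx returns the best distance found so far.
    return [
        [nx.graph_edit_distance(g, h, timeout=timeout) for h in target_graphs]
        for g in query_graphs
    ]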

Generating Counterfactual Candidates

To generate counterfactual candidates for the AIDS dataset with the default hyperparameters, run this command:

python vrrw.py --dataset aids

The counterfactual candidates and meta-information are saved under results/{dataset}/runs/. You can check other available training options with:

python vrrw.py --help

Generating Summary Counterfactuals

To generate the counterfactual summary set for the AIDS dataset from the candidates with the default hyperparameters, run this command:

python summary.py --dataset aids

The coverage and cost performance for different summary sizes will be printed on screen. You can check other available summary options with:

python summary.py --help

Coverage and Cost Performance

The following table shows recourse coverage (𝜃 = 0.1) and median recourse cost comparison between GCFExplainer and baselines for a 10-graph global explanation. GCFExplainer consistently and significantly outperforms all baselines across different datasets.

[Table: recourse coverage and median recourse cost of GCFExplainer vs. baselines across datasets]

To reproduce the results for GCFExplainer in the table, run the following script for each dataset and collect the performance corresponding to the top-10 explanations:

python summary.py --dataset {dataset}
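
For reference, both metrics can be computed from a matrix of distances between the input graphs needing recourse and the counterfactuals in the summary. The sketch below assumes such a matrix and follows the paper's definitions as described above (an input graph is covered if some summary counterfactual lies within distance 𝜃 of it; its recourse cost is the distance to the closest one); it is not the repository's evaluation code.

import numpy as np

def coverage_and_median_cost(dist, theta=0.1):
    # dist: (num_input_graphs, summary_size) matrix of (normalized) distances
    # between each input graph needing recourse and each summary counterfactual.
    nearest = dist.min(axis=1)                   # cost of the cheapest recourse
    coverage = float((nearest <= theta).mean())  # fraction of covered inputs
    median_cost = float(np.median(nearest))      # median recourse cost
    return coverage, median_cost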

Case Study on the AIDS Dataset

The following figure illustrates global and local counterfactual explanations for the AIDS dataset. The global counterfactual graph (c) presents a high-level recourse rule—changing ketones and ethers into aldehydes (shown in blue)—to combat HIV, while the edge removals (shown in red) recommended by local counterfactual examples from baselines (b) are hard to generalize.

Citing

If you find our framework useful, please consider citing the following paper:

@inproceedings{gcfexplainer2023,
  author = {Kosan, Mert and Huang, Zexi and Medya, Sourav and Ranu, Sayan and Singh, Ambuj},
  title = {Global Counterfactual Explainer for Graph Neural Networks},
  booktitle = {WSDM},
  year = {2023}
}