Introduction

This is the official implementation of the paper "Causal Discovery in Knowledge Graphs by Exploiting Asymmetric Properties of Non-Gaussian Distributions". The method proposed in the paper is implemented on top of several existing open-source libraries. Follow the setup instructions below to install the requirements and get the code running.

Setup and Installation

To run the experiments, we created a new Conda environment with Python 3.8.5. To create a new environment, run: conda create -n <env-name> python=3.8

Once the environment is created, activate it with: conda activate <env-name>

You can then install all the required dependencies by running this command: pip install -r requirements.txt

Training custom embeddings

To train your own TuckER embeddings with custom hyperparameters, follow the steps below. Otherwise, skip ahead to running the hybrid algorithm.

Head over to pykg2vec and follow the instructions to install the pykg2vec package for training custom embeddings.
cd into the examples folder of the cloned pykg2vec repository.
To train the embeddings with the hyperparameters in custom_hp.yaml, run the following command in the examples folder: python train.py -exp True -mn TuckER -ds freebase15k_237 -hpf custom_hp.yaml. This creates the embeddings and stores them as .tsv files in the /datasets/dataset-name/embeddings folder. You might have to pass the full path to the custom_hp.yaml file included in this repo.
Depending on the version of pykg2vec, you may have to add additional details to the .yaml file located in the site-packages directory of the conda environment (the installation location for pykg2vec).
Once the custom embeddings are trained, you can follow the next steps to execute the algorithm for causal discovery.
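
As a rough illustration of how an exported .tsv file could be inspected before moving on, the following Python sketch reads one file into a list of float vectors. It assumes that each non-empty line holds one embedding as tab-separated numbers with no header row. The function name load_tsv_embeddings and this file layout are assumptions made for illustration, not part of this repository or of pykg2vec, so check them against the files your pykg2vec version produces.

```python
import csv
from pathlib import Path
from typing import List


def load_tsv_embeddings(path: Path) -> List[List[float]]:
    """Read a tab-separated file into a list of float vectors.

    Assumption (not confirmed by the README): one embedding per line,
    values separated by tabs, no header row.
    """
    vectors: List[List[float]] = []
    with path.open(newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if not row:  # csv.reader yields an empty list for blank lines
                continue
            vectors.append([float(value) for value in row])
    return vectors
```

With the result stored in a variable such as vectors, len(vectors) gives the number of embeddings and len(vectors[0]) their dimension, which is a quick way to confirm that training produced what you expected.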

Running the hybrid algorithm

Before running the project, check that the required embeddings (if you trained custom ones) are located in the same folder as the script.
Run the hybrid algorithm from the project folder where hybrid.py is located: python hybrid.py -dataset fb15k-237 -algorithm DirectLiNGAM -plot True
The command writes a text file, results_hybrid.txt, which contains the execution time, the mean p-value, and the causal order. It also plots the directed acyclic graph (DAG) of the causal order output by the algorithm.
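
The embedding check and the command above can also be scripted. The sketch below is a hypothetical helper, not part of hybrid.py: find_embedding_files lists the .tsv files that sit next to the script, and build_hybrid_command assembles the same invocation shown above. Both function names are illustrative assumptions; only the flags -dataset, -algorithm, and -plot and their example values come from this README.

```python
import sys
from pathlib import Path
from typing import List


def find_embedding_files(script_dir: Path) -> List[Path]:
    """Return the .tsv files located directly in script_dir, sorted by path."""
    return sorted(script_dir.glob("*.tsv"))


def build_hybrid_command(dataset: str = "fb15k-237",
                         algorithm: str = "DirectLiNGAM",
                         plot: bool = True) -> List[str]:
    """Assemble the hybrid.py invocation with the flags shown in this README.

    The current Python interpreter is used so that the command runs inside
    the active conda environment.
    """
    return [
        sys.executable, "hybrid.py",
        "-dataset", dataset,
        "-algorithm", algorithm,
        "-plot", str(plot),  # the README passes the literal string "True"
    ]
```

Passing the returned list to subprocess.run with cwd set to the project folder and check=True would launch the run. An empty result from find_embedding_files only means that no custom embeddings have been copied next to the script.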

rohangiriraj/CausalKG
