This repository contains the code and the data used for the experiments in the paper "Rule Induction in Knowledge Graphs Using Linear Programming" by Sanjeeb Dash and Joao Goncalves, AAAI-23.
- The code was tested only on Linux.
- The code is written in C++ and requires a C++ compiler.
- The code uses the commercial solver IBM ILOG CPLEX.
- `code`: contains the code and the Makefile.
- `data`: contains the 5 datasets used in the paper.
- `runs`: contains the parameter files and scripts to run the 5 datasets.
- Install IBM ILOG CPLEX.
- Edit the file `Makefile` in the directory `code` and add the paths to the directories `cplex` and `concert` in the lines `CPLEXDIR = path_to_cplex/cplex` and `CONCERTDIR = path_to_cplex/concert`.
- In the directory `code`, type `make`.
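Before running `make`, it can help to confirm that the CPLEX installation root really contains the two subdirectories named in the Makefile. The following is a minimal sketch, not part of the repository: `check_cplex_root` is a hypothetical helper name, and `path_to_cplex` remains a placeholder for your own installation path.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): check that a CPLEX
# installation root contains the cplex and concert subdirectories that
# CPLEXDIR and CONCERTDIR in the Makefile are expected to point to.
check_cplex_root() {
    root="$1"
    status=0
    for sub in cplex concert; do
        if [ ! -d "$root/$sub" ]; then
            # Report each missing subdirectory on stderr.
            echo "missing directory: $root/$sub" >&2
            status=1
        fi
    done
    return "$status"
}
```

A possible use is `check_cplex_root path_to_cplex && (cd code && make)`. With GNU make, variables given on the command line normally take precedence over ordinary assignments in the Makefile, so `make CPLEXDIR=path_to_cplex/cplex CONCERTDIR=path_to_cplex/concert` may work as an alternative to editing the file, assuming the Makefile does not use `override` for these variables.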
The following instructions show how to run LPRules with the dataset UMLS.
- Go to the directory `run/UMLS`.
- Execute the command `./run_lprules_in_parallel.sh p_UMLS.txt outUMLS 46`, where `p_UMLS.txt` is the parameter file residing in the directory `run/UMLS`, `outUMLS` is the name to be used in output files, and 46 is the number of relations in the UMLS dataset.
- The results are presented at the end of the file `results_outUMLS.txt`.
The number of relations in each dataset is: UMLS 46, Kinship 25, WN18RR 11, FB15k-237 237, YAGO3-10 37.
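Since the last argument of the run script must match the dataset, a small lookup can reduce typing mistakes. The sketch below encodes the relation counts listed above and prints the command for a dataset without executing it. The function names `relation_count` and `lprules_command` are hypothetical. The naming pattern `p_<dataset>.txt` and `out<dataset>` is an assumption extrapolated from the UMLS example and may not hold for the other datasets.

```shell
#!/bin/sh
# Hypothetical helper: map a dataset name to its number of relations,
# using the counts listed in this README.
relation_count() {
    case "$1" in
        UMLS)      echo 46 ;;
        Kinship)   echo 25 ;;
        WN18RR)    echo 11 ;;
        FB15k-237) echo 237 ;;
        YAGO3-10)  echo 37 ;;
        *) echo "unknown dataset: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print (do not run) the command for a dataset.
# Assumption: the parameter file is named p_<dataset>.txt and the output
# name is out<dataset>, following the UMLS example.
lprules_command() {
    n=$(relation_count "$1") || return 1
    echo "./run_lprules_in_parallel.sh p_$1.txt out$1 $n"
}
```

For example, `lprules_command UMLS` prints the same command shown in the UMLS instructions. Check the printed command against the files actually present in the dataset's run directory before executing it.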
The following instructions show how to run LPRules with the dataset UMLS, assuming the existence of a set of rules in the file `input_rules.txt`.
- Go to the directory `run/UMLS`.
- Edit the file `p_UMLS.txt` and set the parameter `run_mode` to 1, 2, or 3. The meaning of each value is: 1 is scenario B in the paper, 2 is scenario C in the paper, and 3 is scenario D in the paper.
- Execute the command `./run_lprules_in_parallel_read_rules.sh p_UMLS.txt outUMLS 46 input_rules.txt`, where `p_UMLS.txt` is the parameter file residing in the directory `run/UMLS`, `outUMLS` is the name to be used in output files, 46 is the number of relations in the UMLS dataset, and `input_rules.txt` is the file containing the set of rules.
- The results are presented at the end of the file `results_outUMLS.txt`.