This repository contains the source code for the paper "Parameterized algorithms for identifying gene co-expression modules via weighted clique decomposition."
Dependencies: numpy, pandas, networkx, scipy, gurobi
From the /cricca directory, run:
python src/main.py --graph_filename [your_fname.txt] --algorithm [either 'wecp', 'ipart', or 'lp'] --parameter [integer parameter value]
For information on other input arguments, run:
python src/main.py --help
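The three flags above suggest a command-line interface along the following lines. This is a minimal sketch assuming argparse; the actual option handling in src/main.py may differ:

```python
import argparse

def build_parser():
    # Hypothetical reconstruction of the CLI described above; the real
    # src/main.py may define these options differently.
    parser = argparse.ArgumentParser(
        description="Weighted clique decomposition")
    parser.add_argument("--graph_filename", required=True,
                        help="path to the input graph file")
    parser.add_argument("--algorithm", required=True,
                        choices=["wecp", "ipart", "lp"],
                        help="decomposition algorithm to run")
    parser.add_argument("--parameter", type=int, required=True,
                        help="integer parameter value")
    return parser

args = build_parser().parse_args(
    ["--graph_filename", "toy_graph.txt",
     "--algorithm", "wecp", "--parameter", "4"])
print(args.algorithm, args.parameter)  # prints: wecp 4
```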
Steps to generate the corpus used in the paper experiments (see the paper for additional graph-generation details):
Download Transcription Factor Base Data: TF Data
Open Catalog.xls and save the sheet 'TFactS_sign_less_version2' as 'gene_TF_dat.csv' in the generator_dat directory.
Download Latent Variable Base Data (note: the file is 84 GB): LV Data
Place the files named 'multiplier_model_z.tsv' and 'multiplier_model_summary.tsv' in the generator_dat directory.
From the /cricca directory, run:
python src/graph_gen/gen_acda21_corpus.py
Both TF and LV graphs will now exist in the src/data/pre_preprocessing directory.
To preprocess each graph, run the following from the /cricca directory:
python src/exp_preprocess.py -f 0 -l 19
-f/--first_seed and -l/--last_seed give the range of seeds for the graphs to preprocess.
To kernelize each preprocessed graph with k in [2, 11], run the following from the /cricca directory:
python src/exp_runkernel.py -f 0 -l 19
To run each matrix decomposition algorithm, wecp (the original algorithm by Feldmann et al. 2020), ipart (integer partitioning based), and lp (linear programming based), run the following from the /cricca directory:
python src/exp_runbsd.py -f 0 -l 19
Note: the timeout code in the function run_bsd() in exp_runbsd.py only works on Unix systems (it relies on the signal.signal()/signal.alarm() calls).
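The Unix-only mechanism referred to here is the SIGALRM pattern. The following is a generic sketch of how such a timeout is typically written, not the repository's exact code:

```python
import signal

class Timeout(Exception):
    pass

def _on_alarm(signum, frame):
    # SIGALRM handler: abort the running call by raising an exception.
    raise Timeout()

def run_with_timeout(fn, seconds):
    # Install a SIGALRM handler and schedule an alarm; Unix only,
    # because signal.alarm is unavailable on Windows.
    old = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)
    try:
        return fn()
    except Timeout:
        return None  # fn exceeded its time budget
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, old)

# A call that finishes within the budget returns normally:
print(run_with_timeout(lambda: 42, seconds=5))  # prints 42
```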
To combine all data saved in pickle files (after running exp_runbsd.py) into a single CSV file, run the following from the /cricca directory:
python src/tocsv.py
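The combining step can be pictured roughly as follows. This is a hedged sketch: the pickle layout, file locations, and column names used by the actual src/tocsv.py are assumptions here:

```python
import glob
import pickle

import pandas as pd

def pickles_to_csv(pattern, out_csv):
    # Load every pickle matching the glob pattern; each file is assumed
    # (hypothetically) to hold a dict of result fields for one run.
    rows = []
    for path in sorted(glob.glob(pattern)):
        with open(path, "rb") as f:
            rows.append(pickle.load(f))
    # One row per pickle, one column per result field.
    df = pd.DataFrame(rows)
    df.to_csv(out_csv, index=False)
    return df
```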
To run the memory tests comparing the wecp, ipart, and lp algorithms, run the following from the /cricca directory:
python src/exp_runmemtests.py
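One generic way to compare peak memory use across algorithm runs in Python is the standard tracemalloc module. This is a sketch only; exp_runmemtests.py may measure memory differently:

```python
import tracemalloc

def peak_memory_bytes(fn, *args, **kwargs):
    # Measure the peak Python heap allocation while fn runs.
    tracemalloc.start()
    try:
        fn(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak

# e.g. a larger workload should report a larger peak:
small = peak_memory_bytes(lambda: [0] * 1_000)
large = peak_memory_bytes(lambda: [0] * 1_000_000)
print(small < large)  # prints True
```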