Quantifying the regulatory potential of genetic variants via a hybrid sequence-oriented model

SVEN

This repository contains the code for SVEN, a multi-modality, sequence-oriented in silico model for quantifying the regulatory impact of genetic variants in over 350 tissues and cell lines.

The SVEN framework is described in the following manuscript: Yu Wang, Nan Liang and Ge Gao, Quantify genetic variants' regulatory potential via a hybrid sequence-oriented model, bioRxiv (2024).

Installation

Important note: SVEN provides two prediction modes, Full mode and Fast mode. Full mode requires downloading ~380 GB of model parameter files, while Fast mode requires only ~2 GB, with negligible loss of precision. To reproduce the results from our manuscript, please use Full mode.

Clone the repository, then download and extract the necessary resource files:

git clone https://github.com/gao-lab/SVEN.git
cd SVEN

# Download and extract resources and model parameters (Fast mode is the default)
sh download_resources.sh
# Download and extract resources and model parameters for Full mode
sh download_resources.sh -m full
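Because the two modes differ so much in download size, it can help to check free disk space before running the download script. The following is a minimal sketch, not part of SVEN: the function name `has_enough_space` and the `REQUIRED_GB` table are hypothetical, the sizes are the approximate figures quoted in the note above (read as gigabytes), and any extra space needed while extracting archives is not accounted for.

```python
import shutil

# Approximate download sizes from the installation note, in gigabytes.
# These are rough figures; extraction may temporarily need more space.
REQUIRED_GB = {"fast": 2, "full": 380}


def has_enough_space(mode, path=".", free_bytes=None):
    """Return True if `path` has room for the resources of the given mode.

    `free_bytes` can be passed explicitly (useful for testing); otherwise
    the free space of the filesystem containing `path` is queried.
    """
    if mode not in REQUIRED_GB:
        raise ValueError(f"unknown mode: {mode!r}")
    if free_bytes is None:
        free_bytes = shutil.disk_usage(path).free
    return free_bytes >= REQUIRED_GB[mode] * 10**9
```

For example, `has_enough_space("full")` called from the repository root reports whether the current filesystem can hold the Full mode parameters.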

Install Python 3.8, then install TensorFlow v2.5.0 following the instructions at https://www.tensorflow.org/ and bedtools following the instructions at https://bedtools.readthedocs.io/. Install the remaining dependencies with pip install -r requirements.txt.
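Once everything is installed, a quick check of the environment can catch version mismatches early. The sketch below is an illustration only and not shipped with SVEN: `check_environment` is a hypothetical helper, and it assumes TensorFlow was installed under the distribution name `tensorflow` (a variant package name would be reported as missing) and that bedtools is expected on `PATH`.

```python
import shutil
import sys
from importlib import metadata


def check_environment():
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []

    # Python version stated in the installation instructions.
    if sys.version_info[:2] != (3, 8):
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        problems.append(f"Python 3.8 expected, found {found}")

    # TensorFlow version, read from package metadata without importing it.
    try:
        tf_version = metadata.version("tensorflow")
    except metadata.PackageNotFoundError:
        problems.append("TensorFlow is not installed")
    else:
        if tf_version != "2.5.0":
            problems.append(f"TensorFlow 2.5.0 expected, found {tf_version}")

    # bedtools must be callable from the shell.
    if shutil.which("bedtools") is None:
        problems.append("bedtools not found on PATH")

    return problems
```

Printing each entry of `check_environment()` gives a short report of what still needs attention; other versions may well work, so the messages are hints rather than hard errors.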

Usage

This is a quick usage guide; the full guide is coming soon.

# One-hot encoding
python prepare_data.py ./example/test.bed

# Get functional annotations with CPUs in fast mode
python get_annotations.py
# OR Get functional annotations with GPU 0 in fast mode
python get_annotations.py --gpu 0

# Transform annotations
python transform_annotations.py

# Predict gene expression
python predict_expression.py ./test.exp.predict.txt # with all models
python predict_expression.py ./test.exp.predict.txt --target_idx 3 # with the target model at index 3
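The four steps above can be chained from a small driver script. This is a sketch under assumptions rather than an official entry point: `build_commands` and `run_pipeline` are hypothetical names, only the script names and options shown in this guide are used, the scripts are assumed to be run from the repository root, and the positional argument of predict_expression.py is passed through unchanged as `predict_path`.

```python
import subprocess
import sys


def build_commands(bed_file, predict_path, gpu=None, target_idx=None):
    """Build the argument lists for the four steps of the quick guide."""
    python = sys.executable  # run the scripts with the current interpreter

    annotate = [python, "get_annotations.py"]
    if gpu is not None:
        annotate += ["--gpu", str(gpu)]  # omit to run on CPUs

    predict = [python, "predict_expression.py", predict_path]
    if target_idx is not None:
        predict += ["--target_idx", str(target_idx)]  # omit to use all models

    return [
        [python, "prepare_data.py", bed_file],  # one-hot encoding
        annotate,                               # functional annotations
        [python, "transform_annotations.py"],   # transform annotations
        predict,                                # predict gene expression
    ]


def run_pipeline(bed_file, predict_path, gpu=None, target_idx=None):
    """Run each step in order and stop at the first failing command."""
    for command in build_commands(bed_file, predict_path, gpu, target_idx):
        print("running:", " ".join(command))
        subprocess.run(command, check=True)
```

Calling `run_pipeline("./example/test.bed", "./test.exp.predict.txt", gpu=0)` reproduces the example commands with GPU 0. Keeping command construction separate from execution makes it easy to inspect the commands before anything is run.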

Contact

Yu Wang: wangy@mail.cbi.pku.edu.cn
