songlab-cal/CPT

CPT

Cross-protein transfer learning for variant effect prediction

This repository contains the code and data for reproducing the main results from the manuscript "Cross-protein transfer learning substantially improves zero-shot prediction of disease variant effects".

analysis.ipynb: Jupyter notebook for the main analyses.

CPT/: Python files for models and utility functions.

data/: Data necessary to train and evaluate the models.

We also provide pre-computed CPT-1 scores for 18,602 human proteins at:

  1. Zenodo
  2. Huggingface (an interactive app to visualize and download individual proteins)
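
Once a per-protein score file has been downloaded, it can be read with a few lines of Python. The sketch below is only illustrative: it assumes a CSV file with a header row and two hypothetical columns, `variant` and `score`. The actual files may use a different format or different column names, so check the header before relying on it. The helper `load_scores` and the toy values are not part of this repository.

```python
import csv
import io


def load_scores(handle, variant_col="variant", score_col="score"):
    """Return a {variant: score} dict from an open CSV file handle.

    The column names are assumptions; pass the real ones if they differ.
    """
    reader = csv.DictReader(handle)
    return {row[variant_col]: float(row[score_col]) for row in reader}


# Toy stand-in for a downloaded file. The values are placeholders,
# not real CPT-1 scores.
example = io.StringIO("variant,score\nA2C,0.12\nA2D,0.87\n")
scores = load_scores(example)

# Variant with the highest score in this toy table.
top = max(scores, key=scores.get)
```

For a real download, replace the `io.StringIO` object with `open(path, newline="")`, where `path` points to the file on disk.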

To generate whole-proteome predictions yourself with the trained model, download the feature matrices: EVE set, no-EVE set.

Citation

Jagota, M.*, Ye, C.*, Albors, C., Rastogi, R., Koehl, A., Ioannidis, N., and Song, Y.S.†
"Cross-protein transfer learning substantially improves disease variant prediction", Genome Biology, 24, Article Number: 182 (2023).

*These authors contributed equally to this work.
†To whom correspondence should be addressed: yss@berkeley.edu

DOI: https://doi.org/10.1186/s13059-023-03024-6
