This repository provides the R package implementing part of the analyses presented in:
- Sarkar, A. K., Ward, L. D., & Kellis, M. (2016). Functional enrichments of disease variants across thousands of independent loci in eight diseases. bioRxiv. http://dx.doi.org/10.1101/048066
The Python package is available from http://www.github.com/aksarkar/frea. The computational pipeline described in the text (which uses both packages) is available from http://www.github.com/aksarkar/frea-pipeline.
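To install the R package from GitHub (R refuses to read commands from standard input unless --save, --no-save, or --vanilla is given):
R --no-save <<EOF
devtools::install_github('aksarkar/frea')
EOF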
The R package requires:
- R > 3.1
- Cairo
- devtools
- dplyr
- ggplot2
- gtable
- plyr
- reshape2
- scales
The design of the packages rests on several ideas that reflect the compute environment in which they were developed (Univa Grid Engine, with relatively strict memory limits but many compute nodes):
- Use independent Python processes to distribute work in a massively parallel fashion across compute nodes (using mechanisms outside of Python, such as GNU parallel)
- Use streaming algorithms wherever possible, building as few intermediate data structures as possible
- Invoke modules as scripts (python -m) for entry points wherever possible
- Use R to produce visualizations
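As a rough illustration of the streaming and python -m ideas above, here is a minimal sketch of a module that aggregates sorted, tab-separated records one group at a time. The module name (sum_by_key), the function names, and the two-column input format are all assumptions made for this example; none of them are taken from the frea package.

```python
"""Hypothetical module sum_by_key.py; an illustration only, not part of frea."""
import itertools
import sys


def parse(lines):
    """Lazily yield (key, value) pairs from tab-separated lines."""
    for line in lines:
        key, value = line.rstrip('\n').split('\t')
        yield key, float(value)


def sum_by_key(records):
    """Yield (key, total) for each run of consecutive records sharing a key.

    Assumes the input is already sorted by key, so only the current
    group is consumed at any time and no table of all keys is built.
    """
    for key, group in itertools.groupby(records, key=lambda r: r[0]):
        yield key, sum(value for _, value in group)


def main(paths):
    """Stream each input file through the aggregation and print the totals."""
    for path in paths:
        with open(path) as f:
            for key, total in sum_by_key(parse(f)):
                print(key, total, sep='\t')


if __name__ == '__main__':
    # Entry point when invoked as a script, e.g. python -m sum_by_key chunk.tsv
    main(sys.argv[1:])
```

With the module on the Python path, one process handles one input file (python -m sum_by_key chunk1.tsv), so an external tool can fan the work out across files, for example parallel python -m sum_by_key ::: chunk*.tsv with GNU parallel. The file names here are placeholders.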