fnothaft/gnocchi

gnocchi

Demo code showing how genotype-phenotype association analysis can be done using ADAM. This is a work in progress. Currently, we implement a simple case/control analysis using a chi-squared test.
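
For readers unfamiliar with the test, here is a minimal Python sketch of a chi-squared test on a 2x2 case/control table. It is not gnocchi's implementation: the function name `chi_squared_2x2`, the choice of Python, and the table layout (cases vs. controls, carrying vs. not carrying the alternate allele) are assumptions made only for illustration.

```python
import math


def chi_squared_2x2(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-squared statistic and p-value (1 degree of freedom).

    Hypothetical helper, not part of gnocchi. The four arguments are the
    cell counts of an assumed 2x2 contingency table.
    """
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # An empty row or column leaves the test undefined;
        # report "no evidence of association" in that case.
        return 0.0, 1.0
    stat = n * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


stat, p_value = chi_squared_2x2(20, 0, 0, 20)
```

A small p-value suggests that carrying the allele and having the phenotype are not independent in the sampled data. This sketch applies no continuity correction and no multiple-testing correction.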

Build

To build, install Maven. Then run:

mvn package

Maven will automatically download and install all of the necessary dependencies. Occasionally, the Maven build fails because it runs out of memory. You can work around this by setting the MAVEN_OPTS environment variable to -Xmx2g -XX:MaxPermSize=1g.

Run

To run, you'll need to install Spark. If you are just evaluating locally, you can use a prebuilt Spark distribution. If you'd like to use a cluster, refer to Spark's cluster overview.

Once Spark is installed, set the environment variable SPARK_HOME to point to the Spark installation root directory. Then, you can run gnocchi via ./bin/gnocchi-submit.

We include test data. To run gnocchi on the test data:

./bin/gnocchi-submit regressPhenotypes testData/sample.vcf testData/samplePhenotypes.csv testData/associations -saveAsText

Phenotype Input

We accept phenotype inputs in a CSV format:

Sample,Phenotype,Has Phenotype
mySample,a phenotype,true

The Has Phenotype column is a boolean and must be either true or false. See the test data for more examples.
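
To make the format concrete, here is a minimal Python sketch that reads phenotype rows in the layout shown above. It is not gnocchi's parser: the function name `parse_phenotypes` is hypothetical, and trimming whitespace and accepting true/false in any letter case are assumptions that gnocchi itself may not share.

```python
import csv
import io


def parse_phenotypes(text):
    """Parse phenotype CSV text into (sample, phenotype, has_phenotype) tuples.

    Hypothetical helper, not part of gnocchi. Expects the header
    "Sample,Phenotype,Has Phenotype" as the first line.
    """
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for row in reader:
        # Assumption: tolerate surrounding whitespace and mixed case.
        flag = row["Has Phenotype"].strip().lower()
        if flag not in ("true", "false"):
            raise ValueError(f"expected true or false, got {flag!r}")
        rows.append((row["Sample"].strip(), row["Phenotype"].strip(), flag == "true"))
    return rows


example = "Sample,Phenotype,Has Phenotype\nmySample,a phenotype,true\n"
records = parse_phenotypes(example)
```

Rejecting any value other than true or false early keeps malformed rows from being silently counted as controls.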

License

This project is released under an Apache 2.0 license.
