S-PrediXcan Command Line Usage Examples

Alvaro Barbeira edited this page Feb 28, 2020 · 1 revision

The MetaXcan command line tool needs users to specify the format of the GWAS input files.

Here are some examples for common GWAS data. These commands are incomplete in the sense that the prediction model and covariance are only placeholders; they are concerned with exemplifying how to process a GWAS file.

Bear in mind that MetaXcan needs to figure out the Z-score of the GWAS effect sizes (betas). In most cases it needs a p-value plus one of the following: beta, beta sign, or odds ratio. Some GWAS files already include the Z-score, in which case only that is needed.
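
To make the relationship concrete, here is a minimal sketch of how a signed Z-score can be recovered from a two-sided p-value and an effect direction. It is an illustration of the standard conversion, not MetaXcan's own code; the function name `zscore_from_pvalue` and its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def zscore_from_pvalue(pvalue, beta=None, odds_ratio=None):
    """Return a signed Z-score from a two-sided p-value and an effect direction.

    `beta` may be the actual effect size or just its sign (+1 / -1).
    Hypothetical helper for illustration; not part of MetaXcan.
    """
    if not 0.0 < pvalue <= 1.0:
        raise ValueError("p-value must be in (0, 1]")
    if beta is not None:
        direction = beta
    elif odds_ratio is not None:
        # An odds ratio above 1 means a positive effect, since log(OR) > 0.
        direction = math.log(odds_ratio)
    else:
        raise ValueError("need a beta, a beta sign, or an odds ratio")
    # Magnitude from the two-sided p-value: |z| = -Phi^{-1}(p / 2).
    magnitude = -NormalDist().inv_cdf(pvalue / 2.0)
    # A direction of exactly zero is treated as positive by copysign.
    return math.copysign(magnitude, direction)


print(zscore_from_pvalue(0.05, beta=0.3))
print(zscore_from_pvalue(0.5265, odds_ratio=0.96099))
```

The p-value only fixes the magnitude of the Z-score, which is why a beta, a beta sign, or an odds ratio is needed to supply the sign.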

Plink assoc files

These GWAS files are gzip-compressed, have a header, and look similar to:

         SNP  A1  A2     FRQ    INFO       BETA         SE       P
    rs940550   C   T  0.1522  0.9890  2985.6704  2857.8777   0.299
   rs6650104   A   G  0.0000  0.0000         NA         NA      NA
...

For MetaXcan to be able to parse the file, a user needs to specify the BETA and P columns, together with the allele columns. The arguments would be:

# The options after --output_file are the GWAS format parameters.
$ ./MetaXcan.py --gwas_folder /path/to/my/files --gwas_file_pattern ".*assoc*" \
--model_db_path my_model.db --covariance my_covariance.txt.gz \
--output_file /path/to/output \
--effect_allele_column A1 \
--non_effect_allele_column A2 \
--beta_column BETA --pvalue_column P
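
Before choosing column arguments, it can help to look at the header of a compressed GWAS file and see how many rows carry usable values. The following is a rough pre-check sketch, assuming a whitespace-delimited, gzip-compressed file like the sample above. The function name `summarize_gwas_file` is hypothetical, and how MetaXcan itself treats `NA` rows is not covered here.

```python
import gzip


def summarize_gwas_file(path, required=("BETA", "P"), missing="NA"):
    """Return (column names, usable row count, skipped row count).

    Hypothetical helper: a row is counted as skipped when it is malformed
    or when any of the `required` columns holds the `missing` marker.
    """
    usable = skipped = 0
    with gzip.open(path, "rt") as handle:
        header = handle.readline().split()
        indices = [header.index(name) for name in required]
        for line in handle:
            fields = line.split()
            if not fields:
                continue  # ignore blank lines
            if len(fields) != len(header) or any(fields[i] == missing for i in indices):
                skipped += 1
            else:
                usable += 1
    return header, usable, skipped
```

The returned header lists the exact names that can be passed to options such as `--beta_column` and `--pvalue_column`.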

MAGIC consortium

Suppose that a user has a set of gzipped GWAS files like the MAGIC consortium's:

snp     effect_allele   other_allele    maf     effect  stderr  pvalue
rs10    a       c       .123    1.23e-03 1.23e-02 .9999
....

Then, a user would have to set:

# The options after --output_file are the GWAS format parameters.
$ ./MetaXcan.py --gwas_folder /path/to/my/files --gwas_file_pattern ".*gz" \
--model_db_path my_model.db --covariance my_covariance.txt.gz \
--output_file /path/to/output \
--beta_column effect \
--pvalue_column pvalue \
--snp_column snp \
--effect_allele_column effect_allele \
--non_effect_allele_column other_allele
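
When the same invocation has to be repeated for several GWAS formats, the command can be assembled from a mapping of column options. This is only a sketch: the helper `build_command` and the `MAGIC_COLUMNS` dictionary are hypothetical, and the paths are the same placeholders used in the command above. Only options that already appear on this page are used.

```python
import shlex

# Column options for MAGIC-style files, as listed in the command above.
MAGIC_COLUMNS = {
    "--beta_column": "effect",
    "--pvalue_column": "pvalue",
    "--snp_column": "snp",
    "--effect_allele_column": "effect_allele",
    "--non_effect_allele_column": "other_allele",
}


def build_command(column_options, gwas_folder, file_pattern,
                  model_db, covariance, output_file):
    """Return the MetaXcan invocation as an argument list (hypothetical helper)."""
    args = [
        "./MetaXcan.py",
        "--gwas_folder", gwas_folder,
        "--gwas_file_pattern", file_pattern,
        "--model_db_path", model_db,
        "--covariance", covariance,
        "--output_file", output_file,
    ]
    for option, column in column_options.items():
        args += [option, column]
    return args


command = build_command(
    MAGIC_COLUMNS,
    gwas_folder="/path/to/my/files",
    file_pattern=".*gz",
    model_db="my_model.db",
    covariance="my_covariance.txt.gz",
    output_file="/path/to/output",
)
# shlex.join quotes the pattern so the shell does not expand it.
print(shlex.join(command))
```

Keeping the arguments in a list avoids shell quoting and line-continuation pitfalls; the list could be handed to `subprocess.run(command, check=True)` to launch the tool.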

Example CSV

Suppose now that a user has plain CSV files with the following information:

OrigSNPname,Chromosome,Position,Effect,Baseline,OR,SE,pvalue
rs58108140,1,10583,A,G,0.96099,0.08405,0.5265
...

Then, to parse the files, the following would have to be set:

# The options after --output_file are the GWAS format parameters.
$ ./MetaXcan.py --gwas_folder /path/to/my/files --gwas_file_pattern ".*csv" \
--model_db_path my_model.db --covariance my_covariance.txt.gz \
--output_file /path/to/output \
--or_column OR \
--pvalue_column pvalue \
--snp_column OrigSNPname \
--non_effect_allele_column Baseline \
--effect_allele_column Effect \
--separator ,
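
To illustrate how the column names map onto the data, the sketch below reads the sample row with Python's `csv` module and derives the effect direction from the odds ratio. It is an assumption-laden illustration rather than MetaXcan's parser; `read_or_rows` and its parameter names are hypothetical.

```python
import csv
import io
import math

# Sample taken from the CSV excerpt above.
SAMPLE = """OrigSNPname,Chromosome,Position,Effect,Baseline,OR,SE,pvalue
rs58108140,1,10583,A,G,0.96099,0.08405,0.5265
"""


def read_or_rows(handle, snp="OrigSNPname", effect="Effect",
                 baseline="Baseline", odds_ratio="OR", pvalue="pvalue"):
    """Read comma-separated GWAS rows, keyed by the given column names.

    Hypothetical helper. Column lookup is case-sensitive in this sketch,
    so a name that does not match the header exactly raises KeyError.
    """
    rows = []
    for record in csv.DictReader(handle):
        rows.append({
            "snp": record[snp],
            "effect_allele": record[effect],
            "non_effect_allele": record[baseline],
            # log(OR) < 0 means the effect allele lowers the odds.
            "log_or": math.log(float(record[odds_ratio])),
            "pvalue": float(record[pvalue]),
        })
    return rows


rows = read_or_rows(io.StringIO(SAMPLE))
print(rows[0])
```

For the sample row the odds ratio is below 1, so the log odds ratio, and therefore the sign of the resulting Z-score, is negative.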