Commit

updated README, restored clinvar_alleles_example_750_rows.*.tsv sample output files
bw2 committed Mar 13, 2017
1 parent 5b04ade commit 49caff9
Showing 7 changed files with 537 additions and 533 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -15,6 +15,7 @@ src/last.dump.rda

### Heavy Data Files ###
src/pipeline_output
src/output_tmp
src/logs
*.old.*
ClinVarFullRelease*
4 changes: 3 additions & 1 deletion README.md
@@ -16,6 +16,8 @@ Each of these directories contains the following files:
* __clinvar_alleles.*.tsv.gz__: table where each row represents a single variant allele. This is generated by grouping _clinvar_allele_trait_pairs.tsv.gz_ by allele.
* __clinvar_alleles.*.vcf.gz__: _clinvar_alleles.tsv.gz_ converted to VCF format.
* __clinvar_alleles_with_exac_v1.*.tsv.gz__: _clinvar_alleles.*.tsv.gz_ with additional columns from the [ExAC v1](http://exac.broadinstitute.org/about) dataset that have non-empty values for all ClinVar alleles that are also present in ExAC.
* __clinvar_alleles_with_gnomad_exomes.*.tsv.gz__: _clinvar_alleles.*.tsv.gz_ with additional columns from the [gnomAD](http://gnomad.broadinstitute.org/about) dataset that have non-empty values for all ClinVar alleles that are also present in the gnomAD exomes callset.
* __clinvar_alleles_with_gnomad_genomes.*.tsv.gz__: _clinvar_alleles.*.tsv.gz_ with additional columns from the [gnomAD](http://gnomad.broadinstitute.org/about) dataset that have non-empty values for all ClinVar alleles that are also present in the gnomAD genomes callset.
* __clinvar_alleles_stats.*.txt__: summary of the different columns in _clinvar_alleles.*.tsv.gz_, along with some basic stats on the different values that appear in each column.
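
The README says the per-allele table is produced by grouping the allele-trait pairs by allele. The sketch below illustrates that idea only; it is not the pipeline's actual code. The column names (`chrom`, `pos`, `ref`, `alt`, `trait`), the `;` separator, and the function name `group_by_allele` are all assumptions made for illustration.

```python
from collections import OrderedDict

# Hypothetical key columns; the real tables may name them differently.
ALLELE_KEY = ("chrom", "pos", "ref", "alt")


def group_by_allele(rows, key_fields=ALLELE_KEY, sep=";"):
    """Collapse allele-trait rows into one row per allele.

    Values that differ between rows of the same allele are joined with
    `sep`, in first-seen order and without duplicates.
    """
    grouped = OrderedDict()
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        merged = grouped.setdefault(key, OrderedDict())
        for column, value in row.items():
            values = merged.setdefault(column, [])
            if value not in values:
                values.append(value)
    return [
        {column: sep.join(values) for column, values in merged.items()}
        for merged in grouped.values()
    ]


# Made-up rows: two traits reported for the same allele, one for another.
pairs = [
    {"chrom": "1", "pos": "100", "ref": "A", "alt": "G", "trait": "trait_x"},
    {"chrom": "1", "pos": "100", "ref": "A", "alt": "G", "trait": "trait_y"},
    {"chrom": "2", "pos": "200", "ref": "C", "alt": "T", "trait": "trait_x"},
]
alleles = group_by_allele(pairs)
```

Here the two rows for the first allele collapse into a single row whose `trait` value lists both traits, while the second allele is passed through unchanged.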


@@ -66,7 +68,7 @@ To run the pipeline:
```
cd ./src
pip install --user --upgrade -r requirements.txt
python2.7 master.py --b37-genome /path/to/b37.fasta --b38-genome /path/to/b38.fasta -E /path/to/ExAC.r1.sites.vep.vcf.gz
python2.7 master.py --b37-genome /path/to/b37.fa --b38-genome /path/to/b38.fa -E /path/to/ExAC.r1.sites.vep.vcf.gz -GG /path/to/gnomad.genomes.r2.0.1.sites.coding.autosomes_and_X.vcf.gz -GE /path/to/gnomad.exomes.r2.0.1.sites.vcf.gz
```

See `python master.py -h` for additional options.
182 changes: 91 additions & 91 deletions output/b37/multi/clinvar_alleles_example_750_rows.multi.b37.tsv

Large diffs are not rendered by default.

350 changes: 175 additions & 175 deletions output/b37/single/clinvar_alleles_example_750_rows.single.b37.tsv

Large diffs are not rendered by default.

182 changes: 91 additions & 91 deletions output/b38/multi/clinvar_alleles_example_750_rows.multi.b38.tsv

Large diffs are not rendered by default.

350 changes: 175 additions & 175 deletions output/b38/single/clinvar_alleles_example_750_rows.single.b38.tsv

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions src/master.py
@@ -189,6 +189,7 @@ def download_if_changed(job_runner, local_path, ftp_host, ftp_path):

# create uncompressed example files that contain the 1st 750 lines of the compressed tsvs so people can easily see typical values online on github
job.add("gunzip -c IN:%(tmp_dir)s/clinvar_allele_trait_pairs.%(fsuffix)s.tsv.gz | head -n 750 > OUT:%(output_dir)s/clinvar_allele_trait_pairs_example_750_rows.%(fsuffix)s.tsv" % locals())
job.add("gunzip -c IN:%(tmp_dir)s/clinvar_alleles.%(fsuffix)s.tsv.gz | head -n 750 > OUT:%(output_dir)s/clinvar_alleles_example_750_rows.%(fsuffix)s.tsv" % locals())
job.add("gunzip -c IN:%(tmp_dir)s/clinvar_alleles.%(fsuffix)s.vcf.gz | head -n 750 > OUT:%(output_dir)s/clinvar_alleles_example_750_rows.%(fsuffix)s.vcf" % locals())

# create tsv table with extra fields from ExAC: filter, ac_adj, an_adj, popmax_ac, popmax_an, popmax
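
For readers who want to reproduce one of the example files without a shell, the `gunzip -c ... | head -n 750 > ...` commands above can be approximated in pure Python. This is a minimal sketch, not part of `master.py`; the function name `write_head` and its parameters are hypothetical.

```python
import gzip
import itertools


def write_head(gz_path, out_path, n_lines=750):
    """Write the first n_lines lines of a gzip-compressed file to out_path, uncompressed."""
    # Binary mode copies the bytes through unchanged, as gunzip and head do.
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        for line in itertools.islice(src, n_lines):
            dst.write(line)
```

As with `head -n 750`, the count covers every line in the file, so if the table starts with a header line the output holds that header plus 749 data rows.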