support for GWAS summary TSV files #142

Open
darked89 opened this issue Jul 12, 2021 · 1 comment


@darked89

Hello,

The majority of summary statistics from various studies and biobanks are available as bgzipped TSVs. To convert these to VCF one can use gwas2vcf (https://github.com/MRCIEU/gwas2vcf), but at this point it supports only a fixed set of input columns. Extending it beyond that does not look like a trivial task, at least to me.

Adding these "extra" columns from the TSV to the VCF produced by gwas2vcf can be done with bcftools annotate, but
this looks like a much less flexible process than vcfanno. I am positive that sooner rather than later I will have to not just copy a value from the TSV and "paste" it into the VCF, but modify it on the fly.

Hence my questions:

  1. Would it be possible to enhance vcfanno to handle at least the "well behaved" GWAS TSV files as an annotation source?
    For example the TSV format described here: https://finngen.gitbook.io/documentation/data-description

  2. In the meantime, can vcfanno use a BED/VCF-like format derived from the FinnGen TSV above, with the canonical first 3 BED columns plus REF and ALT:

22      100000          100000      A     T 

followed by either all the remaining columns from the TSV input or just the "extras" not already present in the VCF produced by gwas2vcf?

The ALT and REF are needed, since the input TSV has several rows at the same position that differ only in their alleles, for example:

22      100000          A     T       bunch_of_columns_here
22      100000          A     G      bunch_of_columns_here
22      100000          A     CG
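
To illustrate the point, here is a minimal Python sketch using hypothetical rows that mirror the example above. It shows that a key built from chromosome and position alone collapses the three rows into one, while a key that also includes REF and ALT keeps them distinct. The variable names are illustrative only and are not part of vcfanno or gwas2vcf.

```python
from collections import defaultdict

# Hypothetical rows mirroring the example above: (chrom, pos, ref, alt)
rows = [
    ("22", 100000, "A", "T"),
    ("22", 100000, "A", "G"),
    ("22", 100000, "A", "CG"),
]

by_position = defaultdict(list)  # keyed on (chrom, pos) only
by_allele = {}                   # keyed on (chrom, pos, ref, alt)

for chrom, pos, ref, alt in rows:
    by_position[(chrom, pos)].append(alt)
    by_allele[(chrom, pos, ref, alt)] = True

# Position alone is ambiguous: one key maps to three different ALT alleles.
print(len(by_position), by_position[("22", 100000)])
# Including REF and ALT gives one key per row.
print(len(by_allele))
```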

Thank you,

Darek Kedra

@brentp
Owner

brentp commented Jul 12, 2021

Yes, this is already possible. As long as you can bgzip and tabix it, vcfanno can use it. You will also need a header that indicates "ref" and "alt". You can do that with, e.g., #chrom\tstart\tstop\tref\talt\t... for your example above.

Let me know if this answers your question.
-Brent
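
A rough sketch of the conversion discussed in this thread, in Python. It assumes (this is not confirmed anywhere above) that the first four TSV columns are chromosome, 1-based position, REF and ALT, that the first non-empty line is a header, and that standard 0-based, half-open BED coordinates are wanted. The function name `tsv_to_bedlike` is hypothetical. All remaining columns are carried over unchanged.

```python
import io


def tsv_to_bedlike(lines):
    """Yield BED-like lines: chrom, start, stop, ref, alt, then the extra columns.

    Assumption: input columns 1-4 are chrom, 1-based pos, ref, alt.
    """
    header_done = False
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if not header_done:
            # The first non-empty line is assumed to be the header; emit a new
            # header naming "ref" and "alt" and keep the extra column names.
            header_done = True
            yield "\t".join(["#chrom", "start", "stop", "ref", "alt"] + fields[4:])
            continue
        chrom, pos, ref, alt = fields[:4]
        start = int(pos) - 1      # BED start is 0-based
        stop = start + len(ref)   # half-open end covers the REF allele
        yield "\t".join([chrom, str(start), str(stop), ref, alt] + fields[4:])


# Hypothetical two-row input with a single extra column named "pval".
example = "#chrom\tpos\tref\talt\tpval\n22\t100000\tA\tT\t0.01\n22\t100000\tA\tCG\t0.5\n"
converted = list(tsv_to_bedlike(io.StringIO(example)))
for row in converted:
    print(row)
```

The output would still need to be sorted by chromosome and start, compressed with bgzip, and indexed with tabix (for example `tabix -p bed`) before vcfanno can read it. If you would rather keep the start and stop values exactly as in the example in the question, adjust the two coordinate lines accordingly.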
