VCF2MAF and allele frequency #344

Open
jmf432x opened this issue May 12, 2023 · 4 comments

Comments

@jmf432x

jmf432x commented May 12, 2023

I am converting a VCF to MAF using vcf2maf and importing the result into cBioPortal. I notice that the "af" value from the INFO field is not making it into the MAF output file.

Snippet from the VCF (the "af" value is nowhere to be found in the MAF file):
##INFO=<ID=af,Number=1,Type=Float,Description="Alternate allele frequency">
#CHROM POS ID REF ALT QUAL FILTER INFO
chr17 27419372 . C A . PASS af=0.31410256;cds_syntax=5176G>T;cosmic_status=UNKNOWN;depth=468;effect=MISSENSE;gene_name=MYO18A;protein_syntax=A1726S;transcript_name=NM_078471

Perl VCF2MAF command:
perl vcf2maf.pl --input-vcf /home/jmf432/test_vcf_files/ORD-1543927-02.vcf --output-maf /home/jmf432/test_vcf_files/ORD-1543927-02-JF2.maf --ref-fasta /home/jmf432/.vep/homo_sapiens/109_GRCh38/Homo_sapiens.GRCh38.dna.toplevel.fa.gz --vep-path /home/jmf432/vep/ensembl-vep-release-109.3 --ncbi-build GRCh37 --verbose --vep-overwrite --any-allele --retain-info af

Snippet from the output MAF file (all the af fields are blank):

DOMAINS AF AFR_AF AMR_AF ASN_AF EAS_AF EUR_AF SAS_AF AA_AF EA_AF

I was expecting to see the "af" value from the VCF file in the MAF output file, but it is not there.

Can someone please explain why, and correct me if I have misunderstood something?
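
As a sanity check on the input side, the INFO column of the record above can be parsed to confirm that "af" is really present and numeric before it reaches vcf2maf. This is only a minimal sketch: `parse_info` is a hypothetical helper (not part of vcf2maf), and it assumes the record is tab-delimited as in a real VCF, even though the snippet above is shown with spaces.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flag entries without '=' map to True."""
    fields = {}
    for entry in info.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        fields[key] = value if sep else True
    return fields


# Record from the snippet above, rebuilt with tabs as in a real VCF.
vcf_line = "\t".join([
    "chr17", "27419372", ".", "C", "A", ".", "PASS",
    "af=0.31410256;cds_syntax=5176G>T;cosmic_status=UNKNOWN;depth=468;"
    "effect=MISSENSE;gene_name=MYO18A;protein_syntax=A1726S;"
    "transcript_name=NM_078471",
])

info = parse_info(vcf_line.split("\t")[7])  # INFO is the 8th VCF column
af = float(info["af"])
print(af, info["depth"])
```

Note that the key is lowercase "af", while the columns in the MAF snippet above are uppercase "AF", so they are not necessarily the same field.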

@jmf432x
Author

jmf432x commented May 16, 2023

anyone?

@tanghaibao

In my VCF, the AF field is in the FORMAT column, so I could get VAF with --retain-fmt AF instead of --retain-info.
Then the VAF column is t_AF in the resulting MAF.
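
To see which columns a retained field actually ended up in, it may help to scan the MAF header for the field name without regard to case. The sketch below is an illustration only: `candidate_columns` is a hypothetical helper, the demo header is made up, and the `n_` prefix for a matched-normal column is an assumption (only `t_AF` is confirmed in this thread).

```python
import io


def candidate_columns(maf_handle, field):
    """Return header columns that could hold `field`, compared case-insensitively.

    Checks the bare name plus t_/n_ prefixed forms (the n_ form is an assumption).
    """
    wanted = {field.lower(), "t_" + field.lower(), "n_" + field.lower()}
    for line in maf_handle:
        if line.startswith("#") or not line.strip():
            continue  # skip '#version' style comment lines and blank lines
        header = line.rstrip("\r\n").split("\t")
        return [col for col in header if col.lower() in wanted]
    return []


# Hypothetical header for illustration only, not taken from a real vcf2maf run.
demo = io.StringIO("#version 2.4\n" + "\t".join(
    ["Hugo_Symbol", "DOMAINS", "AF", "AFR_AF", "t_AF", "af"]) + "\n")
found = candidate_columns(demo, "af")
print(found)
```

With a real file, pass `open("your.maf")` instead of the `io.StringIO` demo. If a lowercase `af` column shows up at the end of the header, it is a different column from the uppercase `AF` one.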

@berguner

@tanghaibao AF in the FORMAT column represents the allelic fraction, which is the fraction of reads supporting the variant in that sample. @jmf432x is asking about the population allele frequency of the variant as seen in population databases like gnomAD or dbSNP.

Back to the original issue, I am also missing the gnomAD AF in my MAF file. I would be interested if anyone has figured out the cause and a solution.
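
To make the allelic-fraction definition above concrete, here is a small sketch of the per-sample calculation. The depth of 468 comes from the INFO field in the original post; the alternate read count of 147 is a hypothetical value chosen for illustration and is not reported anywhere in this thread. It happens to reproduce the posted af=0.31410256, which would be consistent with a per-sample fraction, but that is only an inference.

```python
def allelic_fraction(alt_reads, depth):
    """Fraction of reads supporting the alternate allele, or None if depth is 0."""
    if depth <= 0:
        return None
    return alt_reads / depth


depth = 468      # from the INFO field in the original post
alt_reads = 147  # hypothetical: not reported anywhere in the thread
vaf = allelic_fraction(alt_reads, depth)
print(round(vaf, 8))
```

A population allele frequency, by contrast, cannot be derived from one sample's reads; it has to come from an annotation source such as gnomAD.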

@grantn5

grantn5 commented Jan 4, 2024

Hi @berguner, I have been using the Nextflow pipeline here (dev branch) to do the conversion:

https://github.com/FriederikeHanssen/vcftomaf/tree/dev

It automatically includes all VEP annotations present in the INFO column.
