Creating a cohort / multisample VCF #20

Open
kvn95ss opened this issue Feb 21, 2023 · 4 comments

kvn95ss commented Feb 21, 2023

Hello,

We have ~100 samples for which we would like to call mitochondrial variants. The cohort contains a mix of related and unrelated samples. We would like to know the recommended way to call variants for all samples in the cohort.

For example, the test example uses a trio, so I'm not sure whether combining data that way would be feasible for our cohort.

Or would it be better to run mity call and normalize separately for each sample, merge the VCFs, and then use mity report to annotate the variants?

Author

kvn95ss commented Feb 22, 2023

UPDATE: I tried to run mity call with all the BAM files, but got this output:

Calling mitochondrial variants
Running FreeBayes in sensitive mode
No variants in VCF with specified chromosome/s
Writing normalised vcf: cohort_output/all_cohort_chrM.mity.vcf.gz

The VCF file was empty, with no variants. Looks like merging VCFs might be the better option :/

Author

kvn95ss commented Feb 22, 2023

I tried to combine the VCFs with bcftools merge and ran into this error from mity report:

mity 0.3.0
Generating mity report
Traceback (most recent call last):
  File "/root/scripts/mity", line 29, in <module>
    args.func(args)
  File "/root/scripts/mitylib/commands.py", line 160, in _cmd_report
    report.do_report(args.vcf, args.prefix, args.min_vaf, args.out_folder_path)
  File "/root/scripts/mitylib/report.py", line 494, in do_report
    min_vaf)
  File "/root/scripts/mitylib/report.py", line 180, in make_table
    if float(FORMAT_VAF) > float(min_vaf):
ValueError: could not convert string to float: '.'

Any leads appreciated, thanks!

Member

drmjc commented Mar 3, 2023

Hi,
I've never tried a huge cohort, as we usually run mity for probands or trios. So my first recommendation is to run the related samples together in batches, running normalize and report for each family.

If the VCF is empty, my first suggestion is to check which --reference was used and that it matches the MT chromosome name in the BAM, see https://github.com/KCCG/mity#human
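A quick way to see which mitochondrial contig name a BAM uses is to inspect its header (for example, the text printed by `samtools view -H sample.bam`). The following is only a minimal sketch: the function names and the set of candidate names are assumptions for illustration, not part of mity.

```python
# Common names for the human mitochondrial contig.
# This set is an assumption for illustration, not an exhaustive list.
MITO_NAMES = {"chrM", "MT", "chrMT", "M"}


def contig_names(sam_header: str) -> list:
    """Return the SN: values of every @SQ line in a SAM header."""
    names = []
    for line in sam_header.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs.
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def mito_contigs(sam_header: str) -> list:
    """Return the contigs in the header that look like the mitochondrial genome."""
    return [name for name in contig_names(sam_header) if name in MITO_NAMES]


# Toy header text, standing in for the output of `samtools view -H`.
header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:248956422\n"
    "@SQ\tSN:chrM\tLN:16569\n"
)
print(mito_contigs(header))
```

If the name found here (for example `chrM`) differs from the one in the chosen reference (for example `MT`), an empty VCF would be the expected symptom.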

The most recent error message suggests that a variant in the VCF has '.' for the AD or DP value. This likely arose from running bcftools merge in the middle of the toolchain, which is not supported. Thanks for the report; we could add some code that checks for this and at least allows the valid variants to be analysed.
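Such a check could look roughly like the sketch below. It is not mity's actual code: `parse_vaf` and `passes_min_vaf` are hypothetical helpers that treat a missing value ('.' or an empty string) as "no VAF" instead of passing it to `float()`.

```python
from typing import Optional


def parse_vaf(raw: str) -> Optional[float]:
    """Return the field as a float, or None when it is missing ('.' or empty)."""
    raw = raw.strip()
    if raw in {"", "."}:
        return None
    return float(raw)


def passes_min_vaf(raw: str, min_vaf: float) -> bool:
    """True only for variants with a usable VAF above the threshold."""
    vaf = parse_vaf(raw)
    return vaf is not None and vaf > min_vaf


# Made-up values: one valid, one missing (as left by a merge), one below threshold.
values = ["0.25", ".", "0.004"]
kept = [v for v in values if passes_min_vaf(v, 0.01)]
print(kept)
```

With a guard like this, records with missing values are skipped rather than aborting the whole report.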

Author

kvn95ss commented Mar 6, 2023

Hey!

Thanks for the reply. We wanted to have a broad, sweeping look at all mitochondrial variants, so for now I have run mity for individual samples, then merged the Excel reports using pandas. Admittedly it is a hacky way to go about it, and the lack of allele counts is a major drawback.
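For anyone taking the same route, a rough cohort-level count can be recovered from the per-sample tables by grouping on the variant. This is only a sketch in plain Python: the column names (`sample`, `chrom`, `pos`, `ref`, `alt`) and the rows are made up for illustration and are not the columns of the mity report. It counts the samples that report each variant, which is not a true allele count because samples without a call are not distinguished from samples with no coverage.

```python
from collections import defaultdict


def carriers_per_variant(rows):
    """Map (chrom, pos, ref, alt) to the sorted list of samples reporting it."""
    carriers = defaultdict(set)
    for row in rows:
        key = (row["chrom"], int(row["pos"]), row["ref"], row["alt"])
        carriers[key].add(row["sample"])
    return {key: sorted(samples) for key, samples in carriers.items()}


# Made-up rows standing in for the concatenated per-sample reports.
rows = [
    {"sample": "S1", "chrom": "chrM", "pos": 3243, "ref": "A", "alt": "G"},
    {"sample": "S2", "chrom": "chrM", "pos": 3243, "ref": "A", "alt": "G"},
    {"sample": "S2", "chrom": "chrM", "pos": 8993, "ref": "T", "alt": "G"},
]
for variant, samples in carriers_per_variant(rows).items():
    print(variant, len(samples), samples)
```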

The reference we are using is correct, as for individual samples it works with no issues.

I did use bcftools merge just as a blind shot, but it would be great if multi-sample VCFs were supported in mity report.
