hetcount ignores ALT-only sites (such as 1/2) #313
Comments
Please provide a patch for the docs and/or the tool. That helps everyone.
I've tried to look at the code, but I don't know C++ that well so I haven't figured out what needs changing, sorry. I'm not sure if fixing the documentation makes sense at this point; I think it's better to fix the issue first and then properly describe what the tool does afterwards.
OK. Thanks for trying! Maybe someone will get round to it.
Yeah, sorry, last time I got lost while looking at the files referenced in the source for this tool. I'm looking at the code again though and I noticed this bit inside a Lines 59 to 65 in a549707
As I understand the condition at line 63, it should count all alleles that are not 0 (i.e. not REF). So I don't see how it could get it wrong. But then there is this bit just under it: Lines 66 to 69 in a549707
Could it be that the count is saved in the wrong variable (note the comment at line 68) or something like that? I'm not sure how the function works exactly, there is a lot of
Describe the bug
The `vcfhetcount` tool ignores heterozygous sites whose genotype contains only ALT alleles (such as 1/2, 1/3, 2/3 and so on).
Based on the wording in the documentation, I would actually expect these to be counted twice:
But I think the tool is really meant to just count heterozygous sites, no matter how many ALT alleles they hold. However, at the moment it seems to require REF to be one of the alleles in the genotype (e.g. 0/1, 0/2, etc.).
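The two readings above can be made concrete with a minimal sketch. This is not vcflib code: `parse_gt`, `is_het`, and `is_het_with_ref` are hypothetical names, and the second predicate only models the behavior reported here (REF required), not what the C++ source actually does. Skipping genotypes with missing alleles is also an assumption.

```python
import re


def parse_gt(gt):
    """Split a GT string such as '0/1' or '1|2' into allele indices.

    Missing alleles ('.') are returned as None.
    """
    return [None if a == "." else int(a) for a in re.split(r"[/|]", gt)]


def is_het(gt):
    """Heterozygous in the usual sense: at least two distinct called alleles."""
    alleles = parse_gt(gt)
    if None in alleles:
        return False  # assumption: genotypes with missing alleles are skipped
    return len(set(alleles)) > 1


def is_het_with_ref(gt):
    """Model of the reported behavior: heterozygous AND REF (0) is present."""
    return is_het(gt) and 0 in parse_gt(gt)


# Print both predicates side by side for a few genotypes.
for gt in ["0/0", "0/1", "1/1", "1/2", "2|3"]:
    print(gt, is_het(gt), is_het_with_ref(gt))
```

Under these definitions the two predicates agree on 0/0, 0/1 and 1/1, and disagree only on ALT-only genotypes such as 1/2, which is exactly the case this issue is about.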
To Reproduce
Take any VCF with multi-allelic sites; bi-allelic won't do. Run `vcfhetcount` on it.
Then you can compare your result with this example command, which counts possible GT combinations for the first sample (`$10`):
Expected behavior
I would expect the `vcfhetcount` tool to count all heterozygous sites in a sample, including ALT-only sites like 1/2. And they should be counted once, even if they contain two ALT alleles. Consider also rewording its description in the documentation, please.
Additional context
I suspect that the `vcfhethomratio` tool may have the same limitation; however, it's much less straightforward to test, especially since I'm still only guessing what it is actually reporting.
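For the cross-check described under To Reproduce, a rough Python equivalent might look like the sketch below. It is not the original example command, which is not shown here. The function names and the tiny in-memory VCF are made up for illustration. It assumes a tab-separated VCF in which the first sample is column 10 and the position of GT is given by the FORMAT column.

```python
from collections import Counter


def tally_first_sample_gt(lines):
    """Count the GT strings seen in the first sample column (column 10)."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # record has no sample columns
        format_keys = fields[8].split(":")
        if "GT" not in format_keys:
            continue  # no genotype for this record
        gt = fields[9].split(":")[format_keys.index("GT")]
        counts[gt] += 1
    return counts


def count_het(counts):
    """Sum the records whose genotype has at least two distinct called alleles."""
    total = 0
    for gt, n in counts.items():
        alleles = gt.replace("|", "/").split("/")
        if "." not in alleles and len(set(alleles)) > 1:
            total += n
    return total


# Hypothetical four-record VCF, including one ALT-only heterozygous site (1/2).
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1",
    "1\t100\t.\tA\tC\t.\t.\t.\tGT\t0/1",
    "1\t200\t.\tA\tC,G\t.\t.\t.\tGT\t1/2",
    "1\t300\t.\tA\tC\t.\t.\t.\tGT\t1/1",
    "1\t400\t.\tA\tC,G\t.\t.\t.\tGT:DP\t0/2:12",
]

counts = tally_first_sample_gt(example)
print(dict(counts))
print("heterozygous sites:", count_het(counts))
```

To run it on a real file, pass an open file handle instead of the `example` list. The heterozygous total can then be compared with the number `vcfhetcount` reports for the same sample.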