How to make sniffles do a better job in splitting insertions at the same location into two het calls? #460

Open
ymcki opened this issue Mar 6, 2024 · 1 comment


ymcki commented Mar 6, 2024

I am running Sniffles on ONT HG002 data and evaluating against the NIST CMRG v1.00 benchmark, which contains 216 SVs.

I find that most of the false negatives occur because Sniffles fails to call two different insertions at the same position as two heterozygous calls and instead reports them as a single homozygous call. For example, this one is called as a homozygous 304 bp insertion:
chr3 45890270 Sniffles2.INS.2C7S2 N <INS> 60 PASS IMPRECISE;SVTYPE=INS;SVLEN=304;END=45890270;SUPPORT=43;COVERAGE=43,43,43,43,42;STRAND=+-;AF=1.000;STDEV_LEN=31.472;STDEV_POS=0.000;SUPPORT_LONG=0 GT:GQ:DR:DV 1/1:60:0:43

In the above case, the correct call would be a 245 bp insertion on one haplotype and a 309 bp insertion on the other.

I found that when the insertion sizes differ more substantially, they can be called as two heterozygous calls, e.g.:
chr9 137102263 Sniffles2.INS.8DCS8 N <INS> 60 PASS PRECISE;SVTYPE=INS;SVLEN=267;END=137102263;SUPPORT=18;COVERAGE=33,33,33,33,34;STRAND=+-;AF=0.545;STDEV_LEN=0.000;STDEV_POS=0.000;SUPPORT_LONG=0 GT:GQ:DR:DV 0/1:60:15:18
chr9 137102263 Sniffles2.INS.8DDS8 N <INS> 60 PASS PRECISE;SVTYPE=INS;SVLEN=1670;END=137102263;SUPPORT=14;COVERAGE=33,33,33,33,34;STRAND=+-;AF=0.424;STDEV_LEN=2.493;STDEV_POS=9.054;SUPPORT_LONG=0 GT:GQ:DR:DV 0/1:60:19:14

So what parameters can I change to make the latter happen more often? Thanks a lot in advance.

@fritzsedlazeck
Owner

Dear @ymcki,
We are just releasing a new version; the code has already been online for a few days.
This should improve the behavior so that INS calls are not over-merged, as sequence identity is now taken into consideration.
Thanks,
Fritz
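
To illustrate the general idea of using sequence identity when deciding whether two insertions at the same position should be merged, here is a toy sketch. It is not Sniffles' implementation: the function names, the `difflib`-based similarity measure, and both thresholds are assumptions made purely for illustration.

```python
from difflib import SequenceMatcher


def sequence_identity(seq_a, seq_b):
    """Approximate similarity in [0, 1] from difflib's matching-block ratio."""
    return SequenceMatcher(None, seq_a, seq_b, autojunk=False).ratio()


def should_merge(seq_a, seq_b, max_len_diff=0.3, min_identity=0.9):
    """Hypothetical merge rule: similar length AND similar sequence.

    Both thresholds are illustrative assumptions, not Sniffles defaults.
    """
    longer = max(len(seq_a), len(seq_b))
    if longer == 0:
        return True
    # Relative length difference, measured against the longer insertion.
    if abs(len(seq_a) - len(seq_b)) / longer > max_len_diff:
        return False
    return sequence_identity(seq_a, seq_b) >= min_identity


# Toy inserted sequences (made up for this example).
print(should_merge("A" * 40, "A" * 40))   # same length, same sequence
print(should_merge("A" * 40, "C" * 40))   # same length, different sequence
print(should_merge("A" * 40, "A" * 100))  # very different length
```

The point of the second check is that a length filter alone cannot separate two alleles of similar size, whereas comparing the inserted sequences can.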
