Can Sniffles2 detect SV variants which have no spanning reads covering the whole variant? #459

Open
vyx-lucy-kaplun opened this issue Mar 1, 2024 · 11 comments

Comments

@vyx-lucy-kaplun

I am failing to detect a long tandem repeat expansion in an STR region.

All other tandem repeat expansions are detected as expected, but they are much shorter and are covered by at least some spanning reads with anchors at both ends. This one is too long: all reads end up soft-clipped, leaving the target region completely uncovered.
I am using single-sample calling mode with --minsupport 2 and --tandem-repeats with an appropriate BED file, and I get no relevant SV in the region. I also tried without the tandem repeat option, which did not change the output in this region. I also tried --minsupport auto and --mosaic, and still got nothing.
Is there a way to detect this variant with Sniffles2?
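
To make the "spanning versus soft-clipped" distinction concrete, here is a minimal sketch in plain Python that classifies a read from its SAM POS and CIGAR fields. The function names, the thresholds (`min_anchor`, `min_clip`, `slop`) and the toy coordinates are hypothetical and only for illustration; this is not how Sniffles2 decides internally.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]

def soft_clips(cigar):
    """Return (left, right) soft-clip lengths, ignoring hard clips."""
    ops = [(n, op) for n, op in parse_cigar(cigar) if op != "H"]
    left = ops[0][0] if ops and ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return left, right

def reference_span(pos, cigar):
    """0-based half-open reference interval; pos is the 1-based SAM POS."""
    ref_len = sum(n for n, op in parse_cigar(cigar) if op in "MDN=X")
    return pos - 1, pos - 1 + ref_len

def classify_read(pos, cigar, region_start, region_end,
                  min_anchor=30, min_clip=200, slop=50):
    """Label a read as 'spanning', 'clipped' or 'other' relative to a region.

    Thresholds are arbitrary illustration values (assumptions).
    """
    start, end = reference_span(pos, cigar)
    left_clip, right_clip = soft_clips(cigar)
    # Spanning: aligned sequence anchors on both sides of the region.
    if start <= region_start - min_anchor and end >= region_end + min_anchor:
        return "spanning"

    def near_region(x):
        return region_start - slop <= x <= region_end + slop

    # Clipped: the alignment stops at the region and the rest is soft-clipped.
    if (right_clip >= min_clip and near_region(end)) or \
       (left_clip >= min_clip and near_region(start)):
        return "clipped"
    return "other"

# Toy reads around a hypothetical repeat at 0-based positions 1000-1060.
toy_reads = {
    "spans_with_insertion": (501, "520M300I80M"),
    "clipped_on_right": (201, "810M2500S"),
    "clipped_on_left": (1051, "3000S700M"),
    "elsewhere": (101, "400M"),
}
labels = {name: classify_read(pos, cigar, 1000, 1060)
          for name, (pos, cigar) in toy_reads.items()}
```

If every read near the locus comes back as `clipped` and none as `spanning`, no single alignment carries the full inserted sequence, which matches the situation described above.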

image

@fritzsedlazeck
Owner

So Sniffles is not a tandem repeat caller. What I would expect is that you might be able to call these by lowering the mapping quality threshold. Furthermore, the minimum size of an SV to be reported is set to 50 bp! I don't know if the deletion that you show here is longer than 50 bp.
Thanks
Fritz
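
For reference, a rerun with a lower mapping quality cutoff could be assembled roughly as follows. This is a sketch under assumptions: `--minsupport` and `--tandem-repeats` are the options named in this thread, while `--input`, `--vcf` and `--mapq` are the Sniffles2 option names as I recall them and should be checked against `sniffles --help` for the installed version. The file names and the cutoff value are hypothetical.

```python
import shlex

def sniffles_command(bam, vcf, tandem_repeats_bed=None,
                     min_support=2, min_mapq=5):
    """Build (but do not run) a Sniffles2 command line as an argv list.

    min_mapq=5 is an arbitrary illustration value, not a recommendation.
    """
    cmd = [
        "sniffles",
        "--input", bam,
        "--vcf", vcf,
        "--minsupport", str(min_support),
        "--mapq", str(min_mapq),  # assumed option name, verify with --help
    ]
    if tandem_repeats_bed:
        cmd += ["--tandem-repeats", tandem_repeats_bed]
    return cmd

# Hypothetical file names.
cmd = sniffles_command("sample.bam", "sample.lowmapq.vcf",
                       tandem_repeats_bed="repeats.bed")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Writing to a separate VCF makes it easy to compare the calls in the region against the run with default settings.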

@vyx-lucy-kaplun
Author

@fritzsedlazeck
Thank you!
Yes, I realize that it is not intended for that purpose. I am calling tandem repeat expansions as duplications or insertions and processing them further. This variant is much, much longer than 50bp, in fact >2000bp.

@fritzsedlazeck
Owner

I see. Maybe try lowering the mapping quality threshold, but it also depends on the read length here.
Thanks
Fritz

@vyx-lucy-kaplun
Author

@fritzsedlazeck
Thank you. I will try that

@fritzsedlazeck
Owner

Did you have any luck?
Thanks
Fritz

@vyx-lucy-kaplun
Author

@fritzsedlazeck
I am now trying minimap2 with the ava-ont option. I will let you know here as soon as I see any results, but I am reluctant to relax the mapping settings too much, as my reads are ONT and so not great to begin with.

I tried Sniffles with the --no-qc option and got an actual output for the variant in question, but unfortunately it generates variant type 'BND' without a variant consensus sequence. That is not surprising, of course: no read covers the full variant sequence, so there is no consensus.
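
A quick way to see which call types land in the region, and whether any of them carries an explicit sequence, is to scan the VCF records directly. The sketch below uses plain Python on tab-separated VCF lines; the function names and the toy records (contig names, positions, IDs) are hypothetical and simplified, not real Sniffles2 output.

```python
def parse_info(info):
    """Turn a VCF INFO string into a dict; flags without '=' map to True."""
    out = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out

def sv_calls_in_region(vcf_lines, chrom, start, end):
    """Return the calls whose POS falls in [start, end] on the given contig."""
    calls = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        pos = int(fields[1])
        if fields[0] != chrom or not (start <= pos <= end):
            continue
        alt = fields[4]
        calls.append({
            "pos": pos,
            "id": fields[2],
            "svtype": parse_info(fields[7]).get("SVTYPE"),
            # True only if ALT is a plain nucleotide string, i.e. not a
            # symbolic allele such as <INS> and not a breakend such as N]chrB:500]
            "has_sequence": set(alt.upper()) <= set("ACGTN"),
        })
    return calls

# Hypothetical, simplified records for illustration only.
toy_vcf = [
    "##fileformat=VCFv4.2",
    "\t".join(["chrA", "1000", "call1", "N", "N]chrB:500]", "30", "PASS",
               "IMPRECISE;SVTYPE=BND;SUPPORT=2"]),
    "\t".join(["chrA", "5000", "call2", "N", "NACGTACGTACGT", "60", "PASS",
               "PRECISE;SVTYPE=INS;SVLEN=12;SUPPORT=8"]),
]
calls = sv_calls_in_region(toy_vcf, "chrA", 900, 1100)
```

In the toy input, only the BND record falls inside the queried window, and it has no sequence in ALT, which mirrors the outcome described above.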

@fritzsedlazeck
Owner

Yeah, if it's a BND, that means part of the reads map to a different location. Maybe your expansion is too large... mmh. Have you considered a tandem repeat caller? Coming back to this after a while, I don't know if this is ONT or PacBio data. Depending on that, you could try Medaka/Straglr (ONT) or TRGT (PacBio).

@vyx-lucy-kaplun
Author

@fritzsedlazeck Yes, I tried a tandem repeat caller. It is ONT; I tried Straglr, and it yielded nothing. Indeed, you are right, the variant is very long. I tried different variant callers as a last resort, to see whether I might be able to detect it in some way after Straglr came up blank.

@fritzsedlazeck
Owner

One thing I had some luck with was the Medaka model. We just got this paper accepted: https://www.biorxiv.org/content/10.1101/2023.10.29.564632v1.
There they used Medaka and it was actually quite good. I don't know if they overtrained on HG002.

@vyx-lucy-kaplun
Author

@fritzsedlazeck Thank you. I am afraid they are also not doing too well with very long variants.

@fritzsedlazeck
Owner

Mmh, what if you assemble it, e.g. with Flye?
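
A local assembly attempt could look roughly like the sketch below: pull the reads aligned to the locus out of an indexed BAM, convert them to FASTQ, and hand them to Flye. The helper name, file names, region string and thread count are hypothetical. `--nano-raw` is one of Flye's ONT read modes and may not be the best choice for a given basecaller version. Flye may also fail or produce nothing useful on a small, low-coverage read set, and reads lying entirely inside the expansion may not be aligned to the locus at all, so treat this as a starting point only.

```python
import shlex

def local_assembly_commands(bam, region, workdir, threads=4):
    """Build (but do not run) shell commands for a local Flye assembly.

    Assumes the BAM is coordinate-sorted and indexed so that
    'samtools view' can fetch a region.
    """
    q = shlex.quote
    reads = f"{workdir}/region_reads.fastq"
    outdir = f"{workdir}/flye"
    return [
        # Primary alignments keep soft-clipped bases in the read sequence,
        # so the FASTQ contains the full reads, not just the aligned part.
        f"samtools view -b {q(bam)} {q(region)} | samtools fastq - > {q(reads)}",
        f"flye --nano-raw {q(reads)} --out-dir {q(outdir)} --threads {threads}",
    ]

# Hypothetical inputs.
commands = local_assembly_commands("sample.bam", "chrA:900-1100", "local_asm")
for command in commands:
    print(command)
```

The resulting contigs could then be aligned back to the reference to measure the expansion length.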
