Phased SV calling #477

Open
annerilotter opened this issue Apr 22, 2024 · 4 comments

@annerilotter

Hi, first I want to say this is a very nice tool.

My question is about calling phased variants. I currently have phased assemblies and have called SVs with SyRI. I would like to use this read alignment method as an independent validation of the SVs detected with another tool, similar to what was done in this paper: https://www.sciencedirect.com/science/article/pii/S1674205224000820?via%3Dihub#sec3

I currently do not have a phased BAM, as we used hifiasm + Hi-C for the phased assembly. Can I assume that haplotype-specific variants will have only half of the reads spanning that specific breakpoint, or would it be better to use a read-to-assembly alignment method to generate a phased BAM?

Any help would be appreciated,

Kind regards

@fritzsedlazeck
Owner

Dear @annerilotter
this is indeed a bit tricky, I think. So you want to produce a phased BAM file that follows the phasing of the assembly, right?
You could use dipcall to call variants of your phased assembly vs. the reference genome. Then take these phased SNVs and try to tag the reads in a mapped BAM file (e.g. with WhatsHap), and then give the resulting phased BAM file to Sniffles.
I am honestly not sure if this will work, as it requires WhatsHap to accept the phased SNVs from dipcall and the mapped BAM file to correspond to them (which hopefully will be the case).
Nevertheless, this is how I would try this out.
Hope that helps,
Fritz
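
The three steps suggested above could be laid out roughly as follows. This is an untested sketch, not a confirmed workflow: the file names and the `build_phasing_commands` helper are hypothetical, the indexing steps and `--ignore-read-groups` are assumptions about what the inputs will need, and whether WhatsHap accepts the dipcall VCF as-is is exactly the open question raised in the reply. The function only builds the command lines; nothing is executed.

```python
import shlex


def build_phasing_commands(ref, hap1_fa, hap2_fa, reads_bam, prefix="sample"):
    """Build (but do not run) the dipcall -> WhatsHap -> Sniffles commands.

    All paths and the prefix are placeholders chosen for illustration.
    """
    dip_vcf = f"{prefix}.dip.vcf.gz"          # phased calls written by dipcall
    tagged_bam = f"{prefix}.haplotagged.bam"  # reads tagged with HP/PS by WhatsHap
    return [
        # 1. Phased variants from the two assembled haplotypes vs. the reference.
        #    run-dipcall prints a makefile; redirect its stdout to <prefix>.mak.
        ["run-dipcall", prefix, ref, hap1_fa, hap2_fa],
        ["make", "-j2", "-f", f"{prefix}.mak"],
        # Assumption: WhatsHap wants an indexed VCF.
        ["tabix", "-f", "-p", "vcf", dip_vcf],
        # 2. Tag the reference-aligned reads with the assembly-derived phasing.
        ["whatshap", "haplotag", "--reference", ref, "--ignore-read-groups",
         "-o", tagged_bam, dip_vcf, reads_bam],
        ["samtools", "index", tagged_bam],
        # 3. Call SVs on the haplotagged BAM and ask Sniffles to report phase.
        ["sniffles", "--input", tagged_bam,
         "--vcf", f"{prefix}.phased.sv.vcf", "--phase"],
    ]


def as_shell(commands):
    """Render the command lists as copy-pasteable shell lines."""
    return [shlex.join(cmd) for cmd in commands]
```

Calling `as_shell(build_phasing_commands("ref.fa", "hap1.fa", "hap2.fa", "reads.bam"))` gives the lines to review before running anything by hand.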

@annerilotter
Author

Dear @fritzsedlazeck, thank you for the clarification. I will try a couple of things. Basically, I just need to validate that the SV breakpoints exist in at least half of the reads. I think the current WhatsHap strategy seems a bit circular.

Kind regards

@fritzsedlazeck
Owner

Wait, you just want to know if the SV exists in half of the reads? Use samplot or IGV directly, without phasing.

WhatsHap is needed to tag the BAM file and then report the phasing to Sniffles. If you don't do it this way, the hap1 or hap2 assignment will likely differ from your assembly results, because hap1 and hap2 are assigned randomly. You can only rely on them within a phase block.
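
For a quick numeric check of the "half of the reads" question without phasing, one option is to read the support counts from an unphased Sniffles VCF. The sketch below assumes the sample column carries `DR` (reference-supporting reads) and `DV` (variant-supporting reads) in its FORMAT fields; the `support_fraction` helper and the example record are made up for illustration.

```python
def support_fraction(vcf_line):
    """Return DV / (DR + DV) for the first sample of one VCF record.

    Assumes FORMAT contains DR and DV; returns None if no reads are counted.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    keys = fields[8].split(":")     # FORMAT column, e.g. GT:GQ:DR:DV
    values = fields[9].split(":")   # first sample column
    sample = dict(zip(keys, values))
    ref_reads = int(sample["DR"])
    var_reads = int(sample["DV"])
    total = ref_reads + var_reads
    return var_reads / total if total else None


# Made-up record: 12 reference-supporting and 11 variant-supporting reads.
example = "chr1\t1000\tsv1\tN\t<INV>\t60\tPASS\tSVTYPE=INV;END=5000\tGT:GQ:DR:DV\t0/1:40:12:11"
fraction = support_fraction(example)
```

A fraction near 0.5 would be consistent with a haplotype-specific variant, but coverage and mapping bias can shift it, so any cut-off is a judgment call.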

@annerilotter
Author

Hi @fritzsedlazeck, that may work for a few SVs but not for thousands. In any case, I see that Sniffles picked up the variant I was looking to validate, but the breakpoints are not exactly the same, even though the event is in more or less the same region (it is a large inversion). I wanted to use it to get a recall rate between the two methods (SyRI and Sniffles) as a way to independently validate the SVs identified by SyRI. I hope that makes sense. I guess a BED intersect would do, or something like SURVIVOR? I would probably have to accept that Sniffles has an SV there and not consider the SV type (i.e. presence or absence of a variant rather than its type).

Thank you very much for the help.
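
A type-agnostic recall of the kind described above could be sketched like this. It is a minimal illustration, assuming both call sets have already been reduced to `(chrom, start, end)` intervals; the function names, the 50% reciprocal-overlap threshold, and the example coordinates are all hypothetical choices, not something SyRI, Sniffles, or SURVIVOR prescribes.

```python
def reciprocal_overlap(a, b):
    """Smaller of the two overlap fractions between intervals a and b."""
    if a[0] != b[0]:                      # different chromosomes
        return 0.0
    shared = min(a[2], b[2]) - max(a[1], b[1])
    if shared <= 0:                       # no overlap (also covers zero-length calls)
        return 0.0
    return min(shared / (a[2] - a[1]), shared / (b[2] - b[1]))


def recall(truth, calls, min_overlap=0.5):
    """Fraction of `truth` intervals matched by any interval in `calls`.

    SV type is ignored on purpose: only presence or absence is scored.
    """
    if not truth:
        return 0.0
    hits = sum(
        1 for t in truth
        if any(reciprocal_overlap(t, c) >= min_overlap for c in calls)
    )
    return hits / len(truth)


# Made-up intervals: one large event with shifted breakpoints, one missed call.
syri_calls = [("chr1", 1000, 5000), ("chr2", 100, 200)]
sniffles_calls = [("chr1", 1200, 5100)]
syri_recall = recall(syri_calls, sniffles_calls)
```

This loop is quadratic in the number of calls, which is fine for a few thousand SVs; very short events such as insertions would need a breakpoint-distance rule instead of an overlap fraction.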


3 participants