VCF output for xtea_long? #29

Open
eyalmpeer opened this issue Dec 15, 2021 · 6 comments
Labels: enhancement, long reads

Comments

@eyalmpeer

Hello and thank you for sharing this tool.
I ran xtea_long, per the instructions on the xtea_long branch, on PacBio data.
The final output consisted only of txt files.
Is it possible to generate the VCF files mentioned in the article, which can help determine the zygosity of the insertions?
Or is there a way to extract from the xtea_long output how many reads at the insertion location support the insertion and how many do not?
Thanks.

@simoncchu
Collaborator

Yes, this is on my to-do list. I'll add an export to VCF format. For the current output, the meaning of each column can be found here: https://github.com/parklab/xTea_paper/tree/main/run_tools/xTea/HG002.
There is an intermediate file called candidate_list_from_clip.txt that has the number of clipped reads (third column), but I didn't count the mapped...
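
For readers who want the clipped-read counts now, here is a minimal sketch of reading the third column of candidate_list_from_clip.txt. Only "third column = number of clipped reads" comes from the comment above. The tab delimiter, the chromosome and position in columns one and two, the function name `read_clip_counts`, and the sample rows are all assumptions made for illustration, so check them against a real file first.

```python
import csv
import io


def read_clip_counts(handle):
    """Yield (chrom, pos, n_clipped) tuples from a tab-separated handle.

    Assumed layout (not confirmed by the xTea documentation):
    column 1 = chromosome, column 2 = position, column 3 = clipped-read count.
    """
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank, short, or comment lines instead of failing on them.
        if len(row) < 3 or row[0].startswith("#"):
            continue
        yield row[0], int(row[1]), int(row[2])


# Synthetic rows for illustration only; these are not real xTea output.
example = io.StringIO("chr1\t1000\t12\nchr2\t2000\t7\n")
counts = list(read_clip_counts(example))
print(counts)
```

With a real file, replace the `io.StringIO` object with `open("candidate_list_from_clip.txt")`. This gives only the supporting clipped reads. As noted above, reads that do not support the insertion are not counted in this file.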

@simoncchu added the enhancement and long reads labels on Dec 15, 2021
@xzhuo

xzhuo commented Jun 23, 2022

Thanks for the pipeline, it is a very useful tool.

Follow up on the xtea_long output: why do the SVA insertion positions often start from a negative value?
Like the 2nd line here: https://github.com/parklab/xTea_paper/blob/main/run_tools/xTea/HG002/HG002_hg38_Nanopore_xTea_SVA.txt

chr6 138775846 SVA None -1411:1274:+ None

Thank you very much!

@simoncchu
Collaborator

A negative value indicates that the insertion starts or ends in the flanking region (likely a transduction). It is also possible that the reported annotation is incorrect.
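
To find such records in an output file, the example row quoted above can be parsed as in the sketch below. It assumes the fifth whitespace-separated field holds `start:end:strand` on the consensus, which is inferred from this thread rather than from a documented format. `ConsensusSpan`, `parse_span`, and `extends_into_flank` are hypothetical names, not part of xTea.

```python
from typing import NamedTuple


class ConsensusSpan(NamedTuple):
    start: int
    end: int
    strand: str


def parse_span(field: str) -> ConsensusSpan:
    """Parse a 'start:end:strand' field such as '-1411:1274:+'."""
    start, end, strand = field.split(":")
    return ConsensusSpan(int(start), int(end), strand)


def extends_into_flank(span: ConsensusSpan) -> bool:
    """True if either coordinate is negative, i.e. outside the consensus."""
    return span.start < 0 or span.end < 0


# The row quoted earlier in this thread.
line = "chr6 138775846 SVA None -1411:1274:+ None"
fields = line.split()
span = parse_span(fields[4])  # assumption: the span is the fifth field
print(span, extends_into_flank(span))
```

A flagged record is only a candidate for transduction. As mentioned above, a negative value can also come from an incorrect annotation.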

@xzhuo

xzhuo commented Jun 23, 2022

Thanks for your swift reply! Are there ways to infer the correct consensus position for SVA?

@simoncchu
Collaborator

It's not straightforward, as the reference SVA annotation is fragmented and inaccurate (because of tandem repeat expansion). As a simple workaround, just treat position 0 as the start position on the consensus, though this may be inaccurate.
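
The simple workaround above amounts to clamping a negative start to 0. Here is a minimal sketch with a hypothetical helper, `clamp_consensus_start`. It inherits the caveat above that the resulting position may be inaccurate.

```python
from typing import Tuple


def clamp_consensus_start(start: int, end: int) -> Tuple[int, int]:
    """Treat any negative start as position 0 on the consensus."""
    return max(start, 0), end


# Coordinates from the SVA row quoted earlier in this thread.
clamped = clamp_consensus_start(-1411, 1274)
print(clamped)  # (0, 1274)
```

The end coordinate is left unchanged because the thread does not say how to correct it.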

@evayfang2019

Thank you, this is a useful tool for analyzing TEs.
Can xtea_long now generate a VCF file directly? My output is still classified_results*.txt. I don't know if I made a mistake.


4 participants