Viewing Rooted Topologies in R Code #44

Open
jbernst opened this issue Sep 29, 2023 · 3 comments

jbernst commented Sep 29, 2023

Hi Simon,

I am currently using Twisst (which seems like a fantastic program!) to look at the weighted topologies of trees across chromosomes in my study group. Including an outgroup, there are ~6-8 groups (depending on how you view it, but there is definitely hybridization) and 3 ingroup species. For simplicity, and just to get the code working, I am running Twisst with 4 groups: Outgroup, S, V, O. I successfully ran Twisst on one chromosome, which included a set of 961 bootstrapped gene trees from RAxML-NG.

When I ran the R code for visualization, I noticed that the output trees are unrooted. I know this is expected output from Twisst, and the publication mentions that the trees from the weights.tsv.gz file are unrooted and were rooted only for the sake of visualization in the figure (attached below is our data, with S, V, O, and Outgroup, showing unrooted topologies).
[Image: Screenshot 2023-09-29 at 3 42 04 PM]

I noticed, though, that in the downloadable weights.tsv.gz from GitHub you seem to have rooted trees for looking at the weights. How did you get rooted trees from your analysis? We are pretty certain we know the 'true' topology of these organisms, and it would be helpful to know if there is a way to show at least the outgroup as an outgroup in our graphs.
[Image: Screenshot 2023-09-29 at 3 42 15 PM]

I am also wondering whether the interpretation is the same. If you look at the most heavily weighted topology in the image I attached of our own data run, can I still interpret it, despite it being shown as an unrooted tree, as the most abundant topology being (Outgroup,(S,(V,O)))?

I also have one other, more basic question about how Twisst works. The image below shows a distribution of weights for the three topologies when I have gene trees with 4 groups. How does Twisst calculate multiple numbers for a single position (which would be a gene tree)? I ran Twisst on a single gene tree with 4 groups expecting to get a weight distribution of 1,0,0, since there is only one tree topology for a given gene tree; instead I saw numbers that showed multiple topologies for a single gene tree. Since Twisst works on subtrees, I think I am misunderstanding how the algorithm works and how to interpret the results. How do we get 2-3 numbers as weights for each gene tree provided?

[Image: Screenshot 2023-09-29 at 3 50 04 PM]

If it helps, here is the code we ran:

python twisst.py -t iqtree.genetrees.tre.gz -w output.weights.csv.gz \
	--outputTopos output.topologies.trees \
	--method complete \
	-g S \
	-g V \
	-g O \
	-g Outgroup \
	--groupsFile group-file.txt

Here is the groups file we made:

266_og	Outgroup
261_og	Outgroup
263_r	S
262_r	S
267_r	S
269_r	S
263_r	S
267_r	S
268_r	S
266_r	S
268_r	V
268_o	V
268_o	V
261_o	V
265_o	V
266_o	V
266_w	O
265_w	O
263_w	O
265_w	O
261_w	O
260_w	O
267_w	O
263_w	O
261_w	O
269_w	O
266_w	O
266_w	O
267_w	O
264_w	O
263_w	O
264_w	O
262_w	O
266_w	O
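
The listing above repeats some sample names (for example 263_r) and lists 268_r under both S and V. How twisst.py treats such entries is not stated in this thread, so a quick check before running may be worthwhile. The following is a minimal sketch, assuming the groups file has two whitespace-separated columns (sample name, group) as shown; the function name `check_groups_file` is hypothetical and not part of Twisst.

```python
from collections import defaultdict


def check_groups_file(text):
    """Return (names listed more than once, names assigned to several groups)."""
    groups_by_sample = defaultdict(set)
    times_listed = defaultdict(int)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        sample, group = fields
        groups_by_sample[sample].add(group)
        times_listed[sample] += 1
    repeated = sorted(s for s, n in times_listed.items() if n > 1)
    conflicting = sorted(s for s, g in groups_by_sample.items() if len(g) > 1)
    return repeated, conflicting


# A few lines in the same layout as the groups file above.
example = "266_og\tOutgroup\n263_r\tS\n263_r\tS\n268_r\tS\n268_r\tV\n268_o\tV\n"
repeated, conflicting = check_groups_file(example)
print("listed more than once:", repeated)
print("assigned to several groups:", conflicting)
```

To check a real file, the text could be read with `open("group-file.txt").read()` and passed to the same function.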

Thank you so much! This program is looking really useful for this project, and I am looking forward to understanding it better!

@simonhmartin
Owner

Hi Justin, sorry for the delay. If you include --outgroup Outgroup in your twisst command, it should work. Note you still need to include -g Outgroup.
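
As an illustration of the advice above, here is a small sketch that assembles the original command with `--outgroup` added. It only uses flags that appear in this thread (`-t`, `-w`, `--outputTopos`, `--method`, `-g`, `--groupsFile`, `--outgroup`); the helper `build_twisst_command` is hypothetical, and the sketch builds and prints the command rather than running it.

```python
import shlex


def build_twisst_command(trees, weights, topos, groups, groups_file,
                         outgroup=None, method="complete"):
    """Assemble the twisst.py argument list; every group gets its own -g flag."""
    cmd = ["python", "twisst.py", "-t", trees, "-w", weights,
           "--outputTopos", topos, "--method", method]
    for group in groups:
        cmd += ["-g", group]
    if outgroup is not None:
        # The outgroup must still be listed with -g, as noted above.
        if outgroup not in groups:
            raise ValueError("the outgroup must also be passed with -g")
        cmd += ["--outgroup", outgroup]
    cmd += ["--groupsFile", groups_file]
    return cmd


cmd = build_twisst_command(
    trees="iqtree.genetrees.tre.gz",
    weights="output.weights.csv.gz",
    topos="output.topologies.trees",
    groups=["S", "V", "O", "Outgroup"],
    groups_file="group-file.txt",
    outgroup="Outgroup",
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be handed to `subprocess.run(cmd, check=True)`.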

@simonhmartin
Owner

Just to add, the outgroup can have any name. For example -g A -g B -g C -g D --outgroup D.
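
On the earlier question of interpretation: with four groups, each unrooted topology corresponds to exactly one rooted topology once the outgroup is fixed, so an unrooted tree that pairs Outgroup with S and V with O reads as (Outgroup,(S,(V,O))) when rooted on Outgroup. The sketch below illustrates that mapping only; it is not how Twisst or its R plotting code roots trees, and the function name `root_quartet_on_outgroup` is hypothetical.

```python
def root_quartet_on_outgroup(split, outgroup):
    """Turn an unrooted four-group split into a Newick string rooted on the outgroup.

    split is two pairs of group names, e.g. (("Outgroup", "S"), ("V", "O")),
    meaning the internal branch separates Outgroup+S from V+O.
    """
    pair_a, pair_b = split
    if outgroup in pair_a:
        outgroup_pair, other_pair = pair_a, pair_b
    elif outgroup in pair_b:
        outgroup_pair, other_pair = pair_b, pair_a
    else:
        raise ValueError("outgroup is not one of the four groups")
    # The group paired with the outgroup becomes sister to the remaining pair.
    sister = next(g for g in outgroup_pair if g != outgroup)
    return f"({outgroup},({sister},({other_pair[0]},{other_pair[1]})));"


rooted = root_quartet_on_outgroup((("Outgroup", "S"), ("V", "O")), "Outgroup")
print(rooted)
```

The same call with the split (("Outgroup", "V"), ("S", "O")) would give the rooted tree in which V branches off first.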

@simonhmartin
Owner

I see I never responded to this question: "How do we get 2-3 numbers as weights for each gene tree provided?" Do you mean how can the weights be between 0 and 1 for each topology for a single gene tree? If so, you will need to read the original paper to see how the weighting works.
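
As a rough illustration of the idea (not Twisst's actual implementation): when a group contains several samples, a single gene tree yields many subtrees, one for each way of picking one sample per group, and the weight of a topology is the fraction of those subtrees that match it. The sketch below assumes exactly four groups and a tree written as nested tuples; the sample names and the helper functions are hypothetical.

```python
from collections import Counter
from itertools import product


def clades(tree):
    """Collect the set of leaves below every internal node of a nested-tuple tree."""
    found = []

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.append(leaves)
        return leaves

    walk(tree)
    return found


def quartet_label(clade_list, quartet, group_of):
    """Name the unrooted topology of four samples by the pairs of groups it joins."""
    chosen = set(quartet)
    for clade in clade_list:
        inside = chosen & clade
        if len(inside) == 2:  # this branch separates two samples from the other two
            pair_a = sorted(group_of[s] for s in inside)
            pair_b = sorted(group_of[s] for s in chosen - inside)
            first, second = sorted([pair_a, pair_b])
            return f"{first[0]},{first[1]} | {second[0]},{second[1]}"
    return "unresolved"


def topology_weights(tree, groups):
    """Fraction of one-sample-per-group subtrees that show each topology."""
    group_of = {s: g for g, samples in groups.items() for s in samples}
    clade_list = clades(tree)
    counts = Counter(quartet_label(clade_list, q, group_of)
                     for q in product(*groups.values()))
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# One gene tree in which sample s2 (group S) sits among the V samples.
tree = ("og1", ("s1", (("v1", ("v2", "s2")), ("o1", "o2"))))
groups = {"S": ["s1", "s2"], "V": ["v1", "v2"], "O": ["o1", "o2"], "Outgroup": ["og1"]}
weights = topology_weights(tree, groups)
print(weights)
```

In this toy tree, half of the eight subtrees pair V with O and half pair S with V, so the single gene tree gives weights of 0.5, 0.5 and 0 rather than 1, 0, 0. Topologies that never occur are simply absent from the returned dictionary.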
