Are branch lengths in the species tree necessary for my HyPhy analysis? #1690

Closed
Neo-xbx-00 opened this issue Feb 23, 2024 · 4 comments

Comments

@Neo-xbx-00

Dear HyPhy users and developers,
I find this tool really convenient for identifying selection among species. I have run into an issue and hope you can help me out. I have only three species:
(1) I used OrthoFinder to identify orthologs among these species, including the single-copy orthologs.
(2) I then took these single-copy orthologs and analyzed them with HyPhy. However, as I only have three species, I do not have branch lengths for my species tree, which looks like this, with all branch lengths set to 1:
((Emal:1,Erad:1):1,Lbre:1):1;
So, I wonder whether a species tree lacking branch lengths will negatively affect HyPhy analyses and models?
Best regards!

@spond
Member

spond commented Feb 23, 2024

Dear @Neo-xbx-00,

If you provide branch lengths, HyPhy will use them as initial guesses. If you don't, HyPhy will make its own guesses. In any case, these values are re-optimized in the vast majority of analyses; see the example below (data files are in the attached .zip). It is better to provide a tree with no lengths than a tree with bad lengths.

  1. Tree with reasonable lengths.
hyphy busted --alignment CD2.phylip --tree CD2-tree-with-lengths.nwk --starting-points 5
...

* Log(L) = -3414.47, AIC-c =  6910.74 (40 estimated parameters)
....
Likelihood ratio test for episodic diversifying positive selection, **p =   0.0031**.

  2. Tree without lengths.
hyphy busted --alignment CD2.phylip --tree CD2-tree-without-lengths.nwk --starting-points 5
...
* Log(L) = -3414.47, AIC-c =  6910.74 (40 estimated parameters)
....
Likelihood ratio test for episodic diversifying positive selection, **p =   0.0031**.

  3. Tree with really wrong branch lengths (1000 each)
hyphy busted --alignment CD2.phylip --tree CD2-tree-with-crazy-lengths.nwk --starting-points 5
...
* Log(L) = -3414.47, AIC-c =  6910.74 (40 estimated parameters)
....
Likelihood ratio test for episodic diversifying positive selection, **p =   0.0031**.
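
Since a tree with no lengths is preferable to one with bad lengths, placeholder lengths can be removed before running an analysis. The following is a minimal sketch, not part of HyPhy: `strip_branch_lengths` is a hypothetical helper name, and it assumes that every `:` in the Newick string introduces a numeric branch length (no quoted node names containing colons).

```shell
# Hypothetical helper (not a HyPhy command): remove ":<number>" branch
# lengths from a Newick string read on stdin.
# Assumption: colons only ever introduce numeric branch lengths.
# Branch annotations such as {test} are left untouched.
strip_branch_lengths() {
  sed -E 's/:[0-9.]+([eE][+-]?[0-9]+)?//g'
}

echo '((Emal:1,Erad:1):1,Lbre:1):1;' | strip_branch_lengths
# prints: ((Emal,Erad),Lbre);
```

The output can be redirected to a new file and passed to `--tree` in place of the original.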

Best,
Sergei

CD2.zip

@Neo-xbx-00
Author

Some additional details:
In my case, I am testing seven models, and I would like your help identifying potential errors in my commands:
For the FUBAR model:
The code is: cat ../good_HOG.list | while read i; do hyphy fubar CPU=8 --alignment ../../2_pal2nal/pal2nal_results/${i}.pal2nal.fa --tree ./Litt_tree_all.txt --output ${i}.fubar.json > ${i}.fubar.log 2>&1; done
The tree is: ((Emal:1,Erad:1):1,Lbre:1):1;

For the FEL model:
The code is: cat ../good_HOG.list | while read i; do hyphy fel CPU=8 --pvalue 0.05 --alignment ../../2_pal2nal/pal2nal_results/${i}.pal2nal.fa --tree ./Litt_tree_all.txt --output ${i}.fel.json > ${i}.fel.log 2>&1; done
The tree is: ((Emal:1,Erad:1):1,Lbre:1):1;

For the MEME model:
The code is: cat ../good_HOG.list | while read i; do hyphy meme CPU=8 --pvalue 0.05 --branches test --alignment ../../2_pal2nal/pal2nal_results/${i}.pal2nal.fa --tree ../Litt_tree_E_foreground.txt --output ${i}.meme.json > ${i}.meme.log 2>&1; done
The tree is: ((Emal{test}:1,Erad{test}:1):1,Lbre:1):1;

For the aBSREL model:
The code is: cat ../good_HOG.list | while read i; do hyphy absrel CPU=8 --branches test --alignment ../../2_pal2nal/pal2nal_results/${i}.pal2nal.fa --tree ../Litt_tree_Echino_foreground.txt --output ${i}.absrel.json > ${i}.absrel.log 2>&1; done
The tree is: ((Emal{test}:1,Erad{test}:1):1,Lbre:1):1;

And for the RELAX model:
The code is: cat ../good_HOG.list | while read i; do hyphy relax CPU=8 --test T --reference R --alignment ../../2_pal2nal/pal2nal_results/${i}.pal2nal.fa --tree ./Litt_tree_relax_model.txt --output ${i}.relax.json > ${i}.relax.log 2>&1; done
The tree is: ((Emal{T}:1,Erad{T}:1):1,Lbre{R}:1):1;
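
The labeled trees above differ from the unlabeled one only in the `{...}` tags on the tips, so they can be generated from a single source tree instead of being edited by hand. This is a rough sketch under stated assumptions: `label_tips` is a hypothetical helper, not a HyPhy command, and it assumes tip names contain no regular-expression metacharacters and are each followed by `:`, `,` or `)`.

```shell
# Hypothetical helper: append {LABEL} to the named tips of a Newick tree.
# Usage: label_tips LABEL TIP [TIP...] < tree.nwk
# Assumption: tip names are plain alphanumeric strings.
label_tips() {
  label=$1; shift
  tree=$(cat)                       # read the whole tree from stdin
  for tip in "$@"; do
    # Match the tip only when it is delimited as a leaf name.
    tree=$(printf '%s\n' "$tree" |
      sed -E "s/([(,])$tip([:,)])/\1$tip{$label}\2/g")
  done
  printf '%s\n' "$tree"
}

echo '((Emal:1,Erad:1):1,Lbre:1):1;' | label_tips test Emal Erad
# prints: ((Emal{test}:1,Erad{test}:1):1,Lbre:1):1;
```

Chaining two calls, for example `label_tips T Emal Erad | label_tips R Lbre`, gives the test/reference labeling used for RELAX.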

I'm worried that I may have misunderstood how HyPhy analyses work. Many thanks!
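
The loops above could be written a little more defensively. The sketch below reuses the FEL command exactly as quoted, with no new HyPhy options, and wraps it in a hypothetical function, `run_fel_batch`. The `DRY_RUN` switch and the argument order are assumptions made for illustration. It quotes all paths, reads the list with `read -r`, skips blank lines, and redirects the command's standard input from `/dev/null` in case the command reads standard input and would otherwise consume the remaining lines of the list.

```shell
# Hypothetical wrapper around the FEL command quoted above.
# Usage: run_fel_batch LIST_FILE ALIGNMENT_DIR TREE_FILE
# With DRY_RUN=1 (the default in this sketch) commands are only printed.
run_fel_batch() {
  list=$1; aln_dir=$2; tree=$3
  while IFS= read -r id; do
    [ -n "$id" ] || continue                  # skip blank lines in the list
    aln="$aln_dir/$id.pal2nal.fa"
    if [ "${DRY_RUN:-1}" = 1 ]; then
      echo "hyphy fel CPU=8 --pvalue 0.05 --alignment $aln --tree $tree --output $id.fel.json"
    else
      # stdin comes from /dev/null so the loop's list is not consumed
      hyphy fel CPU=8 --pvalue 0.05 --alignment "$aln" --tree "$tree" \
        --output "$id.fel.json" > "$id.fel.log" 2>&1 < /dev/null
    fi
  done < "$list"
}
```

A call such as `run_fel_batch ../good_HOG.list ../../2_pal2nal/pal2nal_results ./Litt_tree_all.txt` prints one command per orthogroup, which can be inspected before rerunning with `DRY_RUN=0`.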

@Neo-xbx-00
Author

Many thanks for your timely reply! I get it now.

Stale issue message
