New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
problem in getting trees by running raxml_sliding_windows.py #31
Comments
Hi Niloo, |
Thanks a lot for your response. Then I will use the ML trees that I got from the IQtree. Because in that approach, I used the unphased data. With the best |
Just to add an opinion on this. I assume you mean that you have taken each diploid individual as a single tip in the tree? Using this approach is not ideal, because actually each haploid allele inside a diploid cell has an independent evolutionary history, and could belong in an entirely different part of the genealogy. When forcing ML software to consider a diploid individual as a tip, you are violating an assumption of the tree inference, which is that each tip has a single evolutionary history. Instead, the ML inference will be trying to force two different histories into one for each individual. |
Thanks, Simon, for letting me know your opinion. Then you think it's better just to use the vcf files including SNPs and then phase it. am I right? |
Hi Niloo, sorry for never responding to this. Yes, my recommendation is to first phase before making trees, if possible. |
Hi Simon
I want to run TWISST for my dataset. I followed your tutorial to get the trees as input for TWISST.
I have a diploid genotype and my data is not suitable for phasing, then I used your script to produce the .geno file. And then, I used raxml_sliding_windows.py to get the window trees. It ran without any error and also gave me the .data.tsv file but the .trees.gz file was included only NA. I also ran this script for the example file (msms_4of10_l50k_r500_sweep.seqgen.SNP.geno.gz), but the same happens.
I used python2 and this command:
python raxml_sliding_windows.py -g msms_4of10_l50k_r500_sweep.seqgen.SNP.geno.gz --prefix out.50 --windType sites -w 50 --model HKY85
Do you have any idea why it happened?
I also have another question: Is it possible to use the ML trees I got by using sequences (including both variant and invariant sites) in IQTREE2 as input for TWISST?
With the best and thanks
Niloo
The text was updated successfully, but these errors were encountered: