Multiple query genes support for SHOOT (with EPA) #7

Open
wants to merge 4 commits into
base: master
@guignonv guignonv commented May 26, 2023

This version supports multiple query genes and provides outputs by matching OGs.

There are small changes in the output files:

  • added a ".sh.map" file giving the correspondence between original query names and processed names (when a ".sh.cleaned" file is created)
  • ".assign.txt" now has an additional column containing the comma-separated list of query genes matching the OG (the corresponding scores are all reported as well, also comma-separated)
  • ".fa.sh.msa.fa", ".fa.sh.msa.fa.query.fa", ".sh.msa.fa.ref.fa", ".fa.shoot.tree", ".sh.orthologs.tsv" and ".fa.sh.msa.fa_epa" are now prefixed with their corresponding OG names
  • ".sh.orthologs.tsv" has a new "Query" column, added as the first column, to report the corresponding query gene
  • ".jplace" files include all the genes matching a given OG

In short, the query genes are grouped by matching OG, and each group is then reintegrated into its OG as a whole (and not one gene at a time).
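
The grouping step could be sketched roughly as below. This is only an illustration, not the code in this PR: the function names, the `(query, OG, score)` tuple layout and the tab-separated output line are all assumptions made for the example.

```python
from collections import defaultdict


def group_queries_by_og(assignments):
    """Group query genes by the OG they were assigned to.

    `assignments` is an iterable of (query_gene, og_name, score) tuples.
    This layout is hypothetical, chosen only for the sketch.
    """
    groups = defaultdict(list)
    for gene, og, score in assignments:
        # Keep insertion order so genes and scores stay aligned per OG.
        groups[og].append((gene, score))
    return dict(groups)


def format_group(og, members):
    """Build one line with comma-separated genes and scores for an OG.

    The column order here is an assumption, not the real ".assign.txt" layout.
    """
    genes = ",".join(gene for gene, _ in members)
    scores = ",".join(str(score) for _, score in members)
    return f"{og}\t{genes}\t{scores}"


# Hypothetical query genes and OG names, for illustration only.
example = [
    ("geneA", "OG0000001", 0.91),
    ("geneB", "OG0000002", 0.75),
    ("geneC", "OG0000001", 0.88),
]
groups = group_queries_by_og(example)
for og, members in groups.items():
    print(format_group(og, members))
```

With this shape, every downstream step (alignment, placement, ortholog reporting) would run once per OG on the whole group of queries, which is also why the per-OG output files need an OG prefix to stay distinct.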

It may need more extensive testing by others who have more datasets than I do.
