Building a taxonomy for use in phyndr #12

Open
uyedaj opened this issue May 12, 2015 · 1 comment

uyedaj commented May 12, 2015

I'm wondering what the best way is to build a taxonomy for use in phyndr. For each species in the tree I can retrieve a lineage from domain to species, but the number of named taxa between "domain" and "species" varies enormously from one lineage to the next. What's the best way to assemble these lineages into a table that phyndr can use? Can I just make a new column for each unique taxon, as long as every column to its right holds either a completely mutually exclusive taxon or a more inclusive one? Each column would then contain only one taxon, and each cell would either be blank or name that clade if the species is a member of it. Here's a simple example of what I mean:

| Species | Genus | rank1 | rank2 | rank3 | rank4 | rank5 |
|---|---|---|---|---|---|---|
| Homo sapiens | Homo | Primates | Mammalia | | | Bilateria |
| Pan paniscus | Pan | Primates | Mammalia | | | Bilateria |
| Drosophila hydei | Drosophila | | | Diptera | Hexapoda | Bilateria |

Here are two example species that illustrate the problem:


$`Malawimonas californiana`
    ottid    rank unique_name       ottTaxonName node_id
1  935422   genus                    Malawimonas 4500653
2 2927065   genus                       Excavata 4500652
3  304358  domain                      Eukaryota  490816
4   93302 no rank             cellular organisms    8295
5  805080 no rank                           life       2

$`Drosophila affinidisjuncta`
     ottid             rank                                 unique_name        ottTaxonName node_id
1   913110 species subgroup                                              grimshawi subgroup 1839837
2   667385    species group                                                 grimshawi group 1839821
3    91810          no rank                                                 grimshawi clade 1839781
4   295685          no rank                                              picture wing clade 1839708
5   688699          no rank                                             Hawaiian Drosophila 1839649
6    34907            genus          Drosophila (genus in Drosophiliti)          Drosophila 1838729
7  1069157          no rank                                                    Drosophiliti 1837343
8   251315         subtribe                                                    Drosophilina 1837342
9   396616            tribe                                                    Drosophilini 1837304
10  127994        subfamily                                                   Drosophilinae 1837303
11   34905           family                                                   Drosophilidae 1836105
12  758897      superfamily                                                     Ephydroidea 1832759
13  758894          no rank                                                    Acalyptratae 1796319
14  133926            genus                                                     Schizophora 1746881
15  951146          no rank                                                    Cyclorrhapha 1746880
16  951142          no rank                                                      Eremoneura 1746879
17  133916       infraorder                                                     Muscomorpha 1722052
18  555807         suborder                                                      Brachycera 1721715
19  661378            order Diptera (order in infraclass Endopterygota)             Diptera 1721713
20 1082885       infraclass                                                   Endopterygota 1347193
21  815350         subclass           Neoptera (subclass in Dicondylia)            Neoptera 1347192
22 1048707          no rank                   Pterygota (in Dicondylia)           Pterygota 1347190
23  983656          no rank                                                      Dicondylia 1346005
24 1062253            class                                                         Insecta 1346001
25  568991       superclass                                                        Hexapoda 1336931
26  985906          no rank                                                    Pancrustacea 1212297
27  985907          no rank                                                     Mandibulata 1212295
28  632179           phylum                                                      Arthropoda 1044777
29  816442          no rank                                                   Panarthropoda 1044776
30  611099          no rank                                                       Ecdysozoa 1043940
31  189832          no rank                                                     Protostomia 1043939
32  117569          no rank                                                       Bilateria 1043938
33  641038          no rank                                                       Eumetazoa 1043937
34  691846          kingdom                                                         Metazoa 1018983
35 5246131          no rank            Holozoa (in phylum Opisthokonta)             Holozoa 1018971
36  332573           phylum                                                    Opisthokonta  495708
37  304358           domain                                                       Eukaryota  490816
38   93302          no rank                                              cellular organisms    8295
39  805080          no rank                                                            life       2
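
The column-per-taxon layout asked about above can be sketched in base R. This is only an illustration under stated assumptions: `build_taxon_table()` and the `lineages` list are hypothetical names (not part of phyndr), each lineage is assumed to be ordered from least to most inclusive with no repeated taxon names, and every taxon, genera included, gets its own column. Whether phyndr accepts a table shaped like this is exactly the open question here.

```r
# Hypothetical input: one character vector per species, ordered from the
# least inclusive named taxon to the most inclusive.
lineages <- list(
  "Homo sapiens"     = c("Homo", "Primates", "Mammalia", "Bilateria"),
  "Pan paniscus"     = c("Pan", "Primates", "Mammalia", "Bilateria"),
  "Drosophila hydei" = c("Drosophila", "Diptera", "Hexapoda", "Bilateria")
)

# build_taxon_table() is a hypothetical helper, not a phyndr function.
build_taxon_table <- function(lineages) {
  # Every unique taxon, in order of first appearance.
  taxa <- unique(unlist(lineages, use.names = FALSE))

  # Number of species contained in each taxon. A nested taxon can never hold
  # more species than the taxon containing it, so sorting by this count keeps
  # more inclusive taxa to the right. Ties keep their first-appearance order,
  # which already runs from less to more inclusive within a lineage.
  n_species <- vapply(taxa, function(tx) {
    sum(vapply(lineages, function(lin) tx %in% lin, logical(1)))
  }, integer(1))
  taxa <- taxa[order(n_species)]

  # One row per species, one column per taxon; blank means "not a member".
  tab <- matrix("", nrow = length(lineages), ncol = length(taxa),
                dimnames = list(names(lineages), taxa))
  for (sp in names(lineages)) {
    tab[sp, lineages[[sp]]] <- lineages[[sp]]
  }
  as.data.frame(tab, stringsAsFactors = FALSE)
}

tab <- build_taxon_table(lineages)
print(tab)
```

Sorting by species count is one simple way to get an ordering in which any column to the right is either disjoint from or more inclusive than the current one; it relies on the lineages being properly nested, which real taxonomy dumps with homonyms may not guarantee.
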
@wcornwell
Collaborator

I think Eastman solved this problem to help build the Zanne et al. tree. His taxonomy table for plants is now available through the TaxonLookup package:

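# Install storr and TaxonLookup from GitHub (requires devtools)
devtools::install_github("richfitz/storr")
devtools::install_github("wcornwell/TaxonLookup")
library(TaxonLookup)
# Preview the first rows of the lookup table returned by add_higher_order()
head(add_higher_order())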

But I'm not sure where the code for turning the table into a phylogeny ended up. Maybe @mwpennell or Tank knows?
