Update microorganisms data set to latest taxonomy
#135
Comments
Remember to add the taxons mentioned in #131 |
Hopefully this will be fixed by an update to the microorganisms data set, but I can't seem to match Clavispora (Candida) lusitaniae. I'm also hoping that Nakaseomyces glabratus and Pichia kudriavzevii will arrive with the update, with their old names redirected to these new names, as our lab is moving over to these new names. |
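The redirect asked for above could be sketched as a plain lookup from old names to new ones. This is only an illustration in base R: `synonyms` and `redirect_name()` are hypothetical names, not part of the AMR package, and the table holds just the three pairs mentioned in this thread.

```r
# Hypothetical old -> new lookup; the three pairs are the ones named in this thread.
synonyms <- c(
  "Candida glabrata"   = "Nakaseomyces glabratus",
  "Candida krusei"     = "Pichia kudriavzevii",
  "Candida lusitaniae" = "Clavispora lusitaniae"
)

# redirect_name() is an illustrative helper, not an existing package function.
# Names without an entry in the lookup are returned unchanged.
redirect_name <- function(x, lookup = synonyms) {
  hit <- lookup[x]                   # NA wherever x has no entry in the lookup
  out <- ifelse(is.na(hit), x, hit)  # fall back to the name as given
  unname(out)
}

redirect_name(c("Candida glabrata", "Candida albicans", "Candida krusei"))
```

Returning unmatched names unchanged is a design choice for the sketch; a real data set would presumably also record which reference source accepted each new name.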
That’s true, it’s a discussion at our lab as well. Do you know if it’s formal already? Meaning, accepted by authoritative taxonomy sources? I’m well aware of the publications suggesting these changes, but do you know whether they are formally adopted already? |
Afraid I don't know for sure about the International Taxonomy groups definitely accepting all these changes. This is a useful paper detailing the most common changes from a medical perspective if you've not already seen it https://doi.org/10.1093/ofid/ofac559 It may well be a moving target unfortunately... |
Yes, I know that one. Unfortunately, Open Forum Infectious Diseases is ‘just’ a journal, not a taxonomically reliable source. The results/propositions of such papers must be ratified by taxonomic sources first. But I found on MycoBank, a great and reliable taxonomic source for fungi, that they do have new names for many Candida species already. I found a couple of inconsistencies though, I’ll share them here and hope that we both could have a look at it. Will be later this week probably. |
From #144, look up these:
Perhaps they are in MycoBank? |
So I've had a further look at this, taking the OFID paper as a starting point. I initially compared the new and old OFID names to the MycoBank downloadable dataset, but then moved on to checking them against GBIF. I've produced a reprex which may hopefully help with deciding which taxonomic names to go with. It's not perfect, but hopefully it can help.
In an ideal world I would probably use all the names within a biotech company's MALDI-TOF database and cross-reference each taxon name to its GBIF status. I suspect biotech companies are never going to give out that kind of information, and pulling it from the instrument, even if possible, is probably not allowed under the software licence because it is commercially sensitive.
It's probably inevitable that biotech companies will progress to using "new" names, and to me it seems reasonable to use GBIF as the reference standard, even if the GBIF accepted taxonomic name isn't what we use most often in human/vet medicine, as long as we can convert all the synonyms to this reference taxonomic name.
I've corrected some spelling mistakes within the OFID paper names. Some of the old OFID names are the accepted taxonomic name in GBIF but do not return a value in the species column when GBIF is interrogated, and therefore return NA.
library(tidyverse)
library(rvest)
library(rgbif)

# print all tbl rows
options(pillar.print_max = Inf)

# download and extract OFID paper tables
url <- "https://academic.oup.com/ofid/article/10/1/ofac559/6974385"
tables <- url %>%
  read_html() %>%
  html_table() %>%
  .[seq(1, 24, 4)] # for some reason it is extracting each table 4 times

# clean OFID tables up
clean_ofid_tlbs <- function(x) {
  janitor::clean_names(x) %>%
    # insert a space where two names were fused together in the scraped table
    mutate(current_name = str_replace_all(current_name, "([a-z])(?=[A-Z][^A-Z]+)", "\\1 ")) %>%
    # one name per row
    separate_longer_delim(current_name, delim = regex("\\s(?=[A-Z])")) %>%
    separate_longer_delim(previous_name_s, delim = regex("\\s(?=var )")) %>%
    separate_longer_delim(previous_name_s, delim = ",") %>%
    mutate(
      # strip footnote letters and correct spelling mistakes in the OFID names
      current_name = str_replace_all(
        current_name,
        c("Nakaseomyces bracarensisa" = "Nakaseomyces bracarensis",
          "Nakaseomyces glabrataa" = "Nakaseomyces glabratus",
          "Nakaseomyces nivariensisa" = "Nakaseomyces nivariensis",
          "Paracoccidioides restrepoanaa" = "Paracoccidioides restrepoana",
          "Talaromyces marneffeib" = "Talaromyces marneffei",
          "Moesziomyces antarticus" = "Moesziomyces antarcticus",
          "Apiotricum domesticum" = "Apiotrichum domesticum",
          "Trematospheria grisea" = "Trematosphaeria grisea",
          "Rhizopus arrhizus var delemar" = "Rhizopus arrhizus var. delemar"
        )
      ),
      current_name = str_remove(current_name, "\\(varieties no longer recognized\\)"),
      current_name = case_when(current_name == "" ~ NA,
                               .default = current_name),
      # expand bare variety names to full names
      previous_name_s = str_replace_all(
        previous_name_s,
        c("var interdigitale" = "Trichophyton mentagrophytes var. interdigitale",
          "var mentagrophytes" = "Trichophyton mentagrophytes var. mentagrophytes",
          "genotype VIII" = "Trichophyton mentagrophytes genotype VIII",
          "var chinensis" = "Rhizopus microsporus var. chinensis",
          "var oligosporus" = "Rhizopus microsporus var. oligosporus",
          "var rhizopodiformis" = "Rhizopus microsporus var. rhizopodiformis"
        )
      )
    ) %>%
    drop_na() %>%
    rename(ofid_old = previous_name_s, ofid_new = current_name) %>%
    select(ofid_old, ofid_new)
}
ofid_tbls <- map(tables, clean_ofid_tlbs)

# Check names against GBIF backbone dataset to see if they are synonym or accepted name
check_gbif_synonym <- function(x) {
  mutate(x,
         ofid_old_GBIF_ = rgbif::name_backbone_checklist(ofid_old)["status"],
         GBIF_ = rgbif::name_backbone_checklist(ofid_old)["species"],
         .after = ofid_old) %>%
    # the `== NA_character_` branch always evaluates to NA, so rows without
    # a GBIF species come out as NA rather than FALSE
    mutate(GBIF_ofid_new_match = case_when(GBIF_$species == ofid_new ~ TRUE,
                                           GBIF_$species != ofid_new ~ FALSE,
                                           GBIF_$species == NA_character_ ~ FALSE,
                                           ),
           .after = ofid_new)
}
syn_status <- map(ofid_tbls, check_gbif_synonym)
syn_status
#> [[1]]
#> # A tibble: 41 × 5
#> ofid_old ofid_old_GBIF_$status GBIF_$species ofid_new GBIF_ofid_new_match
#> <chr> <chr> <chr> <chr> <lgl>
#> 1 Candida bra… SYNONYM Nakaseomyces… Nakaseo… TRUE
#> 2 Candida cat… SYNONYM Diutina cate… Diutina… TRUE
#> 3 Candida col… ACCEPTED Candida coll… Torulas… FALSE
#> 4 Candida fab… SYNONYM Cyberlindner… Cyberli… TRUE
#> 5 Candida fam… ACCEPTED <NA> Debaryo… NA
#> 6 Candida gla… SYNONYM Nakaseomyces… Nakaseo… TRUE
#> 7 Candida gui… SYNONYM Meyerozyma g… Meyeroz… TRUE
#> 8 Candida kru… SYNONYM Issatchenkia… Pichia … FALSE
#> 9 Candida kef… SYNONYM Kluyveromyce… Kluyver… TRUE
#> 10 Candida pse… SYNONYM Kluyveromyce… Kluyver… TRUE
#> 11 Candida lip… SYNONYM Yarrowia lip… Yarrowi… TRUE
#> 12 Candida lus… SYNONYM Clavispora l… Clavisp… TRUE
#> 13 Candida niv… SYNONYM Nakaseomyces… Nakaseo… TRUE
#> 14 Candida neo… SYNONYM Diutina neor… Diutina… TRUE
#> 15 Candida nor… SYNONYM Pichia norve… Pichia … TRUE
#> 16 Candida par… SYNONYM Wickerhamiel… Diutina… FALSE
#> 17 Candida pel… SYNONYM Wickerhamomy… Wickerh… TRUE
#> 18 Pichia anom… SYNONYM Wickerhamomy… Wickerh… TRUE
#> 19 Candida pse… SYNONYM Diutina pseu… Diutina… TRUE
#> 20 Candida rug… SYNONYM Diutina rugo… Diutina… TRUE
#> 21 Cryptococcu… SYNONYM Naganishia a… Naganis… TRUE
#> 22 Cryptococcu… SYNONYM Cutaneotrich… Cutaneo… FALSE
#> 23 Cryptococcu… SYNONYM Cutaneotrich… Cutaneo… TRUE
#> 24 Cryptococcu… SYNONYM Papiliotrema… Papilio… TRUE
#> 25 Pseudozyma … SYNONYM Moesziomyces… Moeszio… TRUE
#> 26 Pseudozyma … SYNONYM Moesziomyces… Moeszio… TRUE
#> 27 Pseudozyma … SYNONYM Dirkmeia chu… Dirkmei… TRUE
#> 28 Pseudozyma … ACCEPTED <NA> Triodio… NA
#> 29 Pseudozyma … SYNONYM Moesziomyces… Moeszio… TRUE
#> 30 Pseudozyma … ACCEPTED <NA> Ustilag… NA
#> 31 Geotrichum … SYNONYM Saprochaete … Magnusi… FALSE
#> 32 Geotrichum … SYNONYM Magnusiomyce… Magnusi… TRUE
#> 33 Saprochaete… SYNONYM Magnusiomyce… Magnusi… TRUE
#> 34 Pichia ohme… SYNONYM Kodamaea ohm… Kodamae… TRUE
#> 35 Trichosporo… SYNONYM Cutaneotrich… Cutaneo… TRUE
#> 36 Trichosporo… SYNONYM Cutaneotrich… Cutaneo… TRUE
#> 37 Trichosporo… SYNONYM Apiotrichum … Apiotri… TRUE
#> 38 Trichosporo… SYNONYM Apiotrichum … Apiotri… TRUE
#> 39 Trichosporo… SYNONYM Cutaneotrich… Cutaneo… TRUE
#> 40 Trichosporo… SYNONYM Apiotrichum … Apiotri… TRUE
#> 41 Trichosporo… ACCEPTED Trichosporon… Apiotri… FALSE
#>
#> [[2]]
#> # A tibble: 37 × 5
#> ofid_old ofid_old_GBIF_$status GBIF_$species ofid_new GBIF_ofid_new_match
#> <chr> <chr> <chr> <chr> <lgl>
#> 1 Acremonium … SYNONYM Sarocladium … Sarocla… TRUE
#> 2 Acremonium … SYNONYM Gliomastix r… Gliomas… TRUE
#> 3 Acremonium … SYNONYM Sarocladium … Sarocla… TRUE
#> 4 Arthroderma… SYNONYM Trichophyton… Trichop… TRUE
#> 5 Cerinosteru… SYNONYM Quambalaria … Quambal… TRUE
#> 6 Sporothrix … SYNONYM Quambalaria … Quambal… TRUE
#> 7 Fusarium di… SYNONYM Bisifusarium… Bisifus… TRUE
#> 8 Fusarium fa… ACCEPTED Fusarium fal… Neocosm… FALSE
#> 9 Acremonium … SYNONYM Fusarium fal… Neocosm… FALSE
#> 10 Fusarium ke… ACCEPTED Fusarium ker… Neocosm… FALSE
#> 11 Fusarium li… ACCEPTED Fusarium lic… Neocosm… FALSE
#> 12 Fusarium pe… ACCEPTED Fusarium pet… Neocosm… FALSE
#> 13 Fusarium so… ACCEPTED Fusarium sol… Neocosm… FALSE
#> 14 Geosmithia … SYNONYM Rasamsonia a… Rasamso… TRUE
#> 15 Penicillium… SYNONYM Rasamsonia a… Rasamso… TRUE
#> 16 Gibberella … ACCEPTED Gibberella f… Fusariu… FALSE
#> 17 Lecythophor… SYNONYM Coniochaeta … Conioch… TRUE
#> 18 Phialophora… SYNONYM Coniochaeta … Conioch… TRUE
#> 19 Microsporum… SYNONYM Paraphyton c… Paraphy… TRUE
#> 20 Microsporum… ACCEPTED Microsporum … Nannizz… FALSE
#> 21 Microsporum… ACCEPTED Microsporum … Lophoph… FALSE
#> 22 Microsporum… ACCEPTED Microsporum … Nannizz… FALSE
#> 23 Microsporum… ACCEPTED Microsporum … Nannizz… FALSE
#> 24 Microsporum… SYNONYM Trichophyton… Nannizz… FALSE
#> 25 Neosartorya… SYNONYM Aspergillus … Aspergi… TRUE
#> 26 Neosartorya… SYNONYM Aspergillus … Aspergi… TRUE
#> 27 Aspergillus… SYNONYM Aspergillus … Aspergi… TRUE
#> 28 Neosartorya… SYNONYM Aspergillus … Aspergi… TRUE
#> 29 Paecilomyce… SYNONYM Purpureocill… Purpure… TRUE
#> 30 Paecilomyce… SYNONYM Marquandomyc… Marquan… TRUE
#> 31 Penicillium… SYNONYM Talaromyces … Talarom… TRUE
#> 32 Penicillium… SYNONYM Talaromyces … Talarom… TRUE
#> 33 Trichophyto… SYNONYM Arthroderma … Arthrod… TRUE
#> 34 Trichophyto… ACCEPTED Trichophyton… Arthrod… FALSE
#> 35 Trichophyto… SYNONYM Trichophyton… Trichop… FALSE
#> 36 Trichophyto… SYNONYM Trichophyton… Trichop… TRUE
#> 37 Trichophyto… ACCEPTED Trichophyton… Trichop… FALSE
#>
#> [[3]]
#> # A tibble: 27 × 5
#> ofid_old ofid_old_GBIF_$status GBIF_$species ofid_new GBIF_ofid_new_match
#> <chr> <chr> <chr> <chr> <lgl>
#> 1 Emmonsia cr… ACCEPTED <NA> Emergom… NA
#> 2 Emmonsia he… SYNONYM Blastomyces … Blastom… TRUE
#> 3 Emmonsia pa… SYNONYM Blastomyces … Blastom… TRUE
#> 4 Emmonsia so… ACCEPTED Emmonsia soli Emergom… FALSE
#> 5 Emmonsia “s… ACCEPTED <NA> Blastom… NA
#> 6 Emmonsia “s… ACCEPTED <NA> Emergom… NA
#> 7 Emmonsia pa… SYNONYM Emergomyces … Emergom… TRUE
#> 8 Histoplasma… ACCEPTED <NA> Histopl… NA
#> 9 Histoplasma… ACCEPTED <NA> Histopl… NA
#> 10 Histoplasma… ACCEPTED <NA> Histopl… NA
#> 11 Histoplasma… ACCEPTED <NA> Histopl… NA
#> 12 Lacazia lob… SYNONYM Paracoccidio… Paracoc… TRUE
#> 13 Paracoccidi… ACCEPTED Paracoccidio… Paracoc… FALSE
#> 14 Paracoccidi… ACCEPTED Paracoccidio… Paracoc… FALSE
#> 15 Paracoccidi… ACCEPTED Paracoccidio… Paracoc… FALSE
#> 16 Paracoccidi… ACCEPTED Paracoccidio… Paracoc… FALSE
#> 17 Paracoccidi… ACCEPTED Paracoccidio… Paracoc… FALSE
#> 18 Penicillium… SYNONYM Talaromyces … Talarom… TRUE
#> 19 Sporothrix … ACCEPTED Sporothrix s… Sporoth… FALSE
#> 20 Sporothrix … ACCEPTED Sporothrix s… Sporoth… FALSE
#> 21 Sporothrix … ACCEPTED Sporothrix s… Sporoth… FALSE
#> 22 Sporothrix … ACCEPTED Sporothrix s… Sporoth… FALSE
#> 23 Sporothrix … ACCEPTED Sporothrix p… Sporoth… FALSE
#> 24 Sporothrix … ACCEPTED Sporothrix p… Sporoth… FALSE
#> 25 Sporothrix … ACCEPTED Sporothrix p… Sporoth… FALSE
#> 26 Sporothrix … ACCEPTED Sporothrix p… Sporoth… FALSE
#> 27 Sporothrix … ACCEPTED Sporothrix p… Sporoth… FALSE
#>
#> [[4]]
#> # A tibble: 9 × 5
#> ofid_old ofid_old_GBIF_$status GBIF_$species ofid_new GBIF_ofid_new_match
#> <chr> <chr> <chr> <chr> <lgl>
#> 1 Bipolaris au… SYNONYM Curvularia a… Curvula… TRUE
#> 2 Bipolaris ha… SYNONYM Curvularia h… Curvula… TRUE
#> 3 Bipolaris sp… SYNONYM Curvularia s… Curvula… TRUE
#> 4 Ochroconis g… SYNONYM Verruconis g… Verruco… TRUE
#> 5 Phialophora … SYNONYM Pleurostoma … Pleuros… TRUE
#> 6 Pseudallesch… ACCEPTED Pseudallesch… Scedosp… FALSE
#> 7 Ramichloridi… SYNONYM Rhinocladiel… Rhinocl… TRUE
#> 8 Ramichloridi… SYNONYM Myrmecridium… Myrmecr… TRUE
#> 9 Scedosporium… ACCEPTED Scedosporium… Lomento… FALSE
#>
#> [[5]]
#> # A tibble: 8 × 5
#> ofid_old ofid_old_GBIF_$status GBIF_$species ofid_new GBIF_ofid_new_match
#> <chr> <chr> <chr> <chr> <lgl>
#> 1 Leptosphaeri… SYNONYM Falciformisp… Falcifo… TRUE
#> 2 Leptosphaeri… SYNONYM Falciformisp… Falcifo… TRUE
#> 3 Scytalidium … SYNONYM Neoscytalidi… Neoscyt… TRUE
#> 4 Scytalidium … SYNONYM Neoscytalidi… Neoscyt… TRUE
#> 5 Hendersonula… SYNONYM Neoscytalidi… Nattras… FALSE
#> 6 Pyrenochaeta… SYNONYM Medicopsis r… Medicop… TRUE
#> 7 Pyrenochaeta… SYNONYM Nigrograna m… Nigrogr… TRUE
#> 8 Madurella gr… SYNONYM Trematosphae… Tremato… TRUE
#>
#> [[6]]
#> # A tibble: 13 × 5
#> ofid_old ofid_old_GBIF_$status GBIF_$species ofid_new GBIF_ofid_new_match
#> <chr> <chr> <chr> <chr> <lgl>
#> 1 Absidia cor… SYNONYM Lichtheimia … Lichthe… TRUE
#> 2 Mycocladus … SYNONYM Lichtheimia … Lichthe… TRUE
#> 3 Rhizopus az… SYNONYM Rhizopus mic… Rhizopu… TRUE
#> 4 Rhizopus de… SYNONYM Rhizopus arr… Rhizopu… FALSE
#> 5 Rhizopus mi… ACCEPTED Rhizopus mic… Rhizopu… TRUE
#> 6 Rhizopus mi… SYNONYM Rhizopus mic… Rhizopu… TRUE
#> 7 Rhizopus mi… SYNONYM Rhizopus mic… Rhizopu… TRUE
#> 8 Rhizopus mi… SYNONYM Rhizopus mic… Rhizopu… TRUE
#> 9 Rhizopus or… SYNONYM Rhizopus arr… Rhizopu… TRUE
#> 10 Rhizomucor … SYNONYM Mucor irregu… Mucor i… TRUE
#> 11 Saksenaea v… ACCEPTED Saksenaea va… Saksena… FALSE
#> 12 Saksenaea v… ACCEPTED Saksenaea va… Saksena… FALSE
#> 13 Saksenaea v… ACCEPTED Saksenaea va… Saksena… FALSE
Created on 2024-05-05 with reprex v2.1.0 |
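The NA values in the GBIF_ofid_new_match column follow from how R compares missing values. A minimal base R sketch of the behaviour, with one possible way to turn a missing GBIF species into FALSE: `match_names()` is a hypothetical helper and the toy vectors are illustrative stand-ins, not the actual GBIF responses.

```r
# Toy vectors standing in for GBIF_$species and ofid_new (illustrative values only).
gbif_species <- c("Nakaseomyces glabratus", NA, "Issatchenkia orientalis")
ofid_new     <- c("Nakaseomyces glabratus", "Debaryomyces hansenii", "Pichia kudriavzevii")

# `==` propagates NA, so a missing GBIF species gives NA rather than FALSE.
gbif_species == ofid_new

# Comparing against NA_character_ is always NA, so it cannot detect missing values.
gbif_species == NA_character_

# match_names() is a hypothetical helper: a missing GBIF species counts as "no match".
match_names <- function(gbif, ofid) {
  !is.na(gbif) & gbif == ofid
}

match_names(gbif_species, ofid_new)
```

Whether a missing species should count as FALSE or stay NA is a judgment call; keeping NA, as the reprex does, makes it easy to spot the names GBIF could not resolve to species level.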
In addition to the above, I have identified some other issues which came up when using AMR on some data. The following were the problems I identified from the output of
Although LPSN lists M. bovis taxonomically as M. tb, UK mycobacterial reference labs still refer to it as M. bovis. This has clinical implications, as M. bovis is intrinsically resistant to pyrazinamide and therefore requires a longer treatment duration than standard short-course therapy for M. tb.
Salmonellae are always difficult, but I thought Enteritidis was a serovar and so should be Salmonella enterica Enteritidis. This is the current WHO-accredited list of Salmonella serovars in case you don't have it: https://www.pasteur.fr/sites/default/files/veng_0.pdf
|
Thanks for the great work-up! GBIF is not very up to date with bacterial taxonomy: when they release in November (which they do annually), there are still hundreds of outdated species according to LPSN, which strictly follows IJSEM publications. But I'll look deeper into what you mentioned here. Great to have this as a reference, so many thanks! |