consider GBIF as (alternative) taxonomic source #458

Open
mfrasca opened this issue Oct 31, 2018 · 10 comments
Labels
taxonomy taxonomy-related

Comments

@mfrasca
Member

mfrasca commented Oct 31, 2018

@tmyersdn writes in #440:

I currently prefer GBIF, as I find it gives better information about synonyms; it is also more international in its governance, with its head office in Copenhagen.
https://www.gbif.org/species/search?q=sterculiaceae
https://en.wikipedia.org/wiki/Global_Biodiversity_Information_Facility

I had a look and I think that the structure of the result is a lot more usable than results from EOL.
http://api.gbif.org/v1/species/match?verbose=false&name=Abies%20argentea

Pity the result contains no intermediate taxonomic information between family and genus (see Vanda, where I expect some reference to Subfamilia Epidendroideae, Tribus Vandeae, Subtribus Aeridinae; or Sterculiaceae, where I expect some reference to Sterculioideae), nor between genus and species (see Rhododendron farrerae: subgenus Azaleastrum, sectio Tsutsusi, subsectio Brachycalyx).
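A minimal sketch of how one could check which ranks a match result carries. The sample record is hand-written to resemble a match response and is not real API output; the intermediate rank key names are hypothetical, chosen only to illustrate what is being asked for.

```python
# Major rank keys that a match record is assumed to carry.
MAJOR_RANKS = ("kingdom", "phylum", "class", "order", "family", "genus", "species")
# Hypothetical key names for the intermediate ranks wished for above.
INTERMEDIATE_RANKS = ("subfamily", "tribe", "subtribe", "subgenus", "section", "subsection")


def ranks_present(match, ranks):
    """Return the rank keys from `ranks` that have a non-empty value in `match`."""
    return [rank for rank in ranks if match.get(rank)]


# Hand-written sample shaped like a match record (assumption, not API output).
sample = {
    "canonicalName": "Vanda",
    "rank": "GENUS",
    "kingdom": "Plantae",
    "family": "Orchidaceae",
    "genus": "Vanda",
}

print(ranks_present(sample, MAJOR_RANKS))         # ['kingdom', 'family', 'genus']
print(ranks_present(sample, INTERMEDIATE_RANKS))  # []
```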

@mfrasca
Member Author

mfrasca commented Nov 1, 2018

but this one contains no information about hybrids!

@mfrasca
Member Author

mfrasca commented Nov 1, 2018

I think that hybrid information is relevant enough that missing it is a no-go.

@mfrasca mfrasca closed this as completed Nov 1, 2018
@thomasstjerne

Hello @mfrasca
GBIF taxonomy includes hybrids, where this information is available from sources:
http://api.gbif.org/v1/species/match?name=Pilosella%20officinarum%20x%20Pilosella%20piloselloides%20subsp.%20bauhinii

When using the species match API, you should make another call to the species API with the usageKey to get all information:
http://api.gbif.org/v1/species/9645296

Cheers,
Thomas Stjernegaard
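
The two-step lookup described in this comment could be sketched as follows. This is only an illustration: the helper names are invented here, and the assumption that an unmatched name comes back without a `usageKey` is not confirmed in this thread. The `fetch` argument is injectable so the flow can be tried without network access.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

API = "http://api.gbif.org/v1/species"


def fetch_json(url):
    """Fetch a URL and decode its JSON body (performs a network request)."""
    with urlopen(url) as response:
        return json.load(response)


def match_url(name):
    """Build the match URL for a name, encoding spaces as %20."""
    return f"{API}/match?{urlencode({'name': name}, quote_via=quote)}"


def usage_url(usage_key):
    """Build the URL of the full record for a usageKey."""
    return f"{API}/{usage_key}"


def lookup(name, fetch=fetch_json):
    """Match a name, then fetch the full record for the matched usageKey."""
    match = fetch(match_url(name))
    usage_key = match.get("usageKey")
    if usage_key is None:  # assumption: no usageKey means nothing matched
        return None
    return fetch(usage_url(usage_key))
```

Calling `lookup("Abies argentea")` would issue two requests, one to the match endpoint and one to the species endpoint.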

@mfrasca
Member Author

mfrasca commented Nov 1, 2018

Hi @thomasstjerne, thank you for the hybrid formula example!
but what about nothotaxa, like Brassocattleya?
compare GBIF with TPL and with Wikipedia.

@MattBlissett

Hi @mfrasca,

Nothotaxa work where we have the data, but we usually follow the Catalogue of Life, which unfortunately says it's not a hybrid. You can see the three checklists in which we find the name (×) Brassocattleya arauji, including TPL and IPNI, which show hybrids. The website runs on the public APIs, so it would be possible to use GBIF's species match API, retrieve the …/related names, then choose the TPL one where it exists (i.e. look for a name from dataset d9a4eedb-e985-4456-ad46-3df8472e00e8). GBIF's matching API only runs against the GBIF backbone, though others have requested that we allow matching against the checklists we index, and we plan to do this.
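
Choosing the TPL entry among the related names might look like the sketch below. It works on a list of records that has already been retrieved; the `datasetKey` field name is an assumption about the record layout, and the sample records are made up.

```python
# Dataset key quoted in the comment above (TPL).
TPL_DATASET_KEY = "d9a4eedb-e985-4456-ad46-3df8472e00e8"


def pick_from_dataset(related, dataset_key=TPL_DATASET_KEY):
    """Return the first related record published in `dataset_key`, or None."""
    for record in related:
        # `datasetKey` is the assumed name of the field holding the dataset id.
        if record.get("datasetKey") == dataset_key:
            return record
    return None


# Made-up records, for illustration only.
related = [
    {"datasetKey": "00000000-0000-0000-0000-000000000000", "scientificName": "name A"},
    {"datasetKey": TPL_DATASET_KEY, "scientificName": "name B"},
]

print(pick_from_dataset(related)["scientificName"])  # name B
```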

Also, I think the export of TPL we have could be improved. It doesn't have genus or family names, and I don't think most hybrid names are formatted correctly.

×Brassocattleya fregoniana we have as a hybrid, since it's not in the Catalogue of Life. (nameType=HYBRID is only for hybrid formulae; I think you just need to look for the × for nothotaxa.)

We don't have any intermediate ranks.

If testing the matching API, provide kingdom=Plantae to avoid possible homonyms in other kingdoms.
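
For instance, a match URL restricted to plants could be built like this. The function name is invented for the example; the `kingdom`, `name` and `verbose` parameters are the ones that appear in this thread.

```python
from urllib.parse import quote, urlencode

MATCH_ENDPOINT = "http://api.gbif.org/v1/species/match"


def plant_match_url(name, verbose=False):
    """Build a match URL that restricts candidates to kingdom Plantae."""
    params = {
        "kingdom": "Plantae",
        "name": name,
        "verbose": str(verbose).lower(),  # the API examples use true/false
    }
    return f"{MATCH_ENDPOINT}?{urlencode(params, quote_via=quote)}"


print(plant_match_url("Abies argentea"))
# http://api.gbif.org/v1/species/match?kingdom=Plantae&name=Abies%20argentea&verbose=false
```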

You had one other comment in your email, which I hope you don't mind me quoting here:

I don't find the example, and I might be confused with a different source, but I thought you sometimes provide year of publication, and again not as a separate field.

I think you've probably read our API documentation, which links to some zoological examples. The year is part of the standard format for a zoological name: Puma Jardine, 1834. Otherwise it was the publishedIn field, on any name where we have it.
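
Since the year is embedded in the name string rather than given as a separate field, a client that wants it has to parse it out. A rough sketch, assuming the year always closes the authorship after a comma (optionally inside parentheses); real authorship strings may well be messier than this pattern allows.

```python
import re

# A comma, a four-digit year, an optional closing parenthesis, end of string.
YEAR_AT_END = re.compile(r",\s*(\d{4})\)?\s*$")


def year_from_name(scientific_name):
    """Return the trailing publication year of a name as an int, or None."""
    found = YEAR_AT_END.search(scientific_name)
    return int(found.group(1)) if found else None


print(year_from_name("Puma Jardine, 1834"))               # 1834
print(year_from_name("Puma concolor (Linnaeus, 1771)"))   # 1771
print(year_from_name("Abies argentea"))                   # None
```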

In case it's useful for assessing its suitability, you can download the GBIF backbone checklist as a Darwin Core archive (zipped TSVs) from https://doi.org/10.15468/39omei , and filter for kingdom=6 (Plantae).
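
Filtering the unzipped TSV could be done in a streaming fashion, roughly as below. The column name and the value to compare against are passed in as parameters because the exact layout of the archive is not described in this thread; the sample data is invented.

```python
import csv
import io


def filter_rows(tsv_file, column, wanted):
    """Yield the rows of a tab-separated file whose `column` equals `wanted`."""
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row.get(column) == wanted:
            yield row


# Invented sample; the column names are assumptions, not the archive's layout.
sample = io.StringIO(
    "taxonID\tcanonicalName\tkingdom\n"
    "1\tAbies\tPlantae\n"
    "2\tPuma\tAnimalia\n"
)

for row in filter_rows(sample, "kingdom", "Plantae"):
    print(row["canonicalName"])  # Abies
```

Because the rows are yielded one at a time, the whole checklist never has to fit in memory.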

We are working on better integration with the Catalogue of Life, and this is likely to include a new version of our checklist API. I don't think there are any new issues here, but I'll tag @mdoering (the developer for GBIF and CoL) anyway.

Thanks!

@mfrasca
Member Author

mfrasca commented Nov 1, 2018

what a huge amount of information! thank you warmly!

  • GBIF backbone, I've downloaded the checklist, I'll study it when I have the chance, hopefully soon enough. and I guess I should import this into a SQL database, or it's too much data to work with.
  • year of publication, I see you have it sometimes in publishedIn, sometimes as part of the scientificName and authorship, but the two things seem to be unrelated.
  • subfamiliae: do you plan to add them? this is also related to having names like Malvaceae twice, once sensu stricto and then again sensu lato. as a non-botanist (I'm a mathematician), I don't think it's correct to say that the families that are synonyms of one of the subfamilies Bombacoideae, Brownlowioideae, Byttnerioideae, etc. are synonyms of Malvaceae.
  • nothotaxa: it is curious to see the × before a genus epithet while looking for a species, but not in the genus epithet when checking the genus itself. I would like to consider the entry for the genus authoritative for the genus epithet.
  • nothotaxa bis: I have a list with 26k genus epithets, 547 of which are marked as hybrid (nothotaxa), I downloaded the json information from your api relative to these 547 genera, and only 21 of them have a × prefix in your database. anyhow, I would rather have an explicit field telling me it's a hybrid name.
  • nothotaxa ter: your database includes infraspecific taxa with the indication 'nothovar.', 'nothosubsp.', 'nothof.'. I'm not sure yet whether this helps or complicates using your data, but I would definitely reconsider it once the database includes a hybrid field for nothotaxa.
  • provide kingdom=Plantae : 👍
  • related: interesting, but since TPL provides an API which I already query from the software, I guess I won't use that. I am looking into offering alternatives to the user, who's then going to choose which to accept.

@mfrasca mfrasca reopened this Nov 1, 2018
@mfrasca mfrasca changed the title consider GBIF as taxonomic source consider GBIF as (alternative) taxonomic source Nov 1, 2018
@tmyersdn

tmyersdn commented Nov 2, 2018

Hi @mfrasca -

TPL - something to note is that TPL is not updated very often (the current version, 1.1, dates from September 2013; see the details on the home page).

@mdoering

mdoering commented Nov 2, 2018

Hi @mfrasca, we do dedicate a property notho to Names in both GBIF and the upcoming CoL+ Clearinghouse. But it is well hidden in GBIF, I can see: we do not expose it in the "NameUsage" aka taxon/species, just in its "Name":
https://api.gbif.org/v1/species/3651890/name
https://api.gbif.org/v1/species/3651890
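
A client could therefore read the marker from the "Name" record rather than from the usage. A small sketch follows; the helper names are invented, and the sample value of `notho` is a guess at what the field may contain, not something taken from the API.

```python
API = "https://api.gbif.org/v1/species"


def usage_url(key):
    """URL of the NameUsage record, which does not expose `notho`."""
    return f"{API}/{key}"


def name_url(key):
    """URL of the Name record, where `notho` is exposed."""
    return f"{API}/{key}/name"


def notho_part(name_record):
    """Return the value of `notho`, or None when the name is not marked."""
    return name_record.get("notho") or None


# Invented records; the value "GENERIC" is an assumption.
print(notho_part({"canonicalName": "some genus", "notho": "GENERIC"}))  # GENERIC
print(notho_part({"canonicalName": "some genus"}))                      # None
print(name_url(3651890))  # https://api.gbif.org/v1/species/3651890/name
```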

@mfrasca
Member Author

mfrasca commented Nov 2, 2018

hello @mdoering and @MattBlissett! still on nothotaxa: some time ago I merged genus information from ars-grin with the default database from Bauble. I collected 26924 genus epithets, of which 547 are marked as nothotaxa. I just checked my merged collection of genus epithets against ipni through kew's 'reconciliation'. 'reconciliation' gives a reply on 25702 epithets (of 26924); it reports 121 nothotaxa, none of which is in my initial set of 547 (so I need to review). in particular, just to spotlight something which is obviously incorrect, only 40 genera in the Orchidaceae family have a hybrid marker in your database, while I have 472 marked thus.
you can check my list from bauble/plugins/plants/default/genus.txt and then grep '"x .*345' genus.txt. (345 is the id of family Orchidaceae).
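
For readers without grep at hand, the same filter can be expressed in Python. The sample lines are invented and only mimic what the pattern implies (a quoted epithet starting with "x " and the family id later on the line); the real layout of genus.txt is not shown in this thread. Note that, like the grep pattern, this also matches any line where 345 merely appears as part of a longer number.

```python
import re

# Same pattern as the grep command above.
PATTERN = re.compile(r'"x .*345')


def grep(lines, pattern=PATTERN):
    """Return the lines containing a match for `pattern`, without newlines."""
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


# Invented sample lines (assumption about the file layout).
sample_lines = [
    '12,"x Brassocattleya","",345\n',
    '13,"Vanda","",345\n',
    '14,"x Examplea","",12\n',
]

print(grep(sample_lines))  # ['12,"x Brassocattleya","",345']
```

With the real file, `grep(open("genus.txt", encoding="utf-8"))` would return the matching lines, and `len(...)` of that list gives the count.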

@mfrasca
Member Author

mfrasca commented Nov 2, 2018

(edited the above, I had made a couple of mistakes matching the two lists)

mfrasca added a commit that referenced this issue Nov 2, 2018
mfrasca added a commit that referenced this issue Nov 7, 2018
@mfrasca mfrasca added the taxonomy taxonomy-related label Mar 26, 2020

5 participants