Flagging occurrence records where community feedback has identified wrong or doubtful identifications #4187

CecSve opened this issue Jul 27, 2022 · 2 comments
CecSve commented Jul 27, 2022

There appears to be a genuine need for data users to provide feedback on occurrence records with wrong or doubtful identifications on GBIF.org, mainly for records coming from citizen science data sources where few or no data quality checks happen at the source. The need seems to come especially from the expert communities working on hard-to-identify taxa, such as insects.

Ideas for potential implementation

  • integrate a feedback option in the portal where users can report wrongly identified occurrence records directly to GBIF (this would require a documented pipeline for GBIF on how to handle such reports, e.g. categorising issues into standardised flags or issues).
  • images from occurrence records where users have reported the identification to be wrong or doubtful should have a disclaimer ribbon attached, stating something along the lines of 'identification issues' or 'wrong identification', and potentially a more elaborate text description associated with the record based on the user's comments. Such a disclaimer could be linked to a specific flag or issue.
  • any flags or tags that appear on the portal should be included in the (DwC-A) download - perhaps under the flags and issues column
  • auto-assign such ribbons/tags etc. based on whether multiple identifications were made at the data source - not sure how programmable it is, but it could be one tag ('multiple identifications exist at data source') or something; see the sketch after this list for one possible starting point.
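
To illustrate the last item, here is a minimal, hypothetical sketch (not an existing GBIF pipeline step) of how a 'multiple identifications at source' flag could be derived from a Darwin Core Archive's Identification extension. The file name, column order and flag name are assumptions for illustration; a real archive declares its files and columns in meta.xml.

```python
# Minimal sketch: flag occurrences that carry more than one row in a
# Darwin Core Archive's Identification (history) extension.
# Assumes the extension is a tab-delimited file whose first column is the
# core record id ("coreid"); treat file and flag names as placeholders.
import csv
from collections import Counter

def flag_multiple_identifications(identification_file: str) -> dict[str, str]:
    """Return {occurrence coreid: flag} for records with >1 identification row."""
    counts: Counter[str] = Counter()
    with open(identification_file, newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh, delimiter="\t")
        next(reader, None)                      # skip header row
        for row in reader:
            if row:                             # first column is the coreid
                counts[row[0]] += 1
    return {
        coreid: "MULTIPLE_IDENTIFICATIONS_AT_SOURCE"  # hypothetical flag name
        for coreid, n in counts.items()
        if n > 1
    }

# Example: flags = flag_multiple_identifications("identification.txt")
```

If something like this were run during interpretation, the resulting tag could also be written into the existing flags/issues column of the download, which would cover the third bullet as well.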

Potential issues for implementation

  • if GBIF includes curation by tagging and flagging occurrences, how should those modifications be dealt with when the datasets are re-indexed? It should be possible to remove flags and tags if the occurrence is updated and the issue no longer persists, but they should also not be removed automatically upon re-indexing (see the sketch after this list for one way this could be handled).
  • auto-assigning flags and issues from citizen science portals may be quite challenging, as some portals have sections that highlight the various identifications, e.g. iNaturalist, while others only have the information in the comments section, e.g. Naturgucker.
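
One hedged sketch of the re-indexing question in the first bullet: if each community flag records which interpreted value it was raised against, a re-index can move the flag to a 'needs review' state when that value changes instead of silently deleting it. All names and states below are invented for illustration and are not GBIF's actual indexing logic.

```python
# Minimal sketch: keep community flags across a dataset re-index by recording
# which value each flag was raised against, and only mark a flag for review
# (never silently drop it) when that value changes or the record disappears.
from dataclasses import dataclass

@dataclass
class CommunityFlag:
    occurrence_key: str
    term: str             # e.g. "scientificName"
    flagged_value: str    # the value the community objected to
    status: str = "OPEN"  # OPEN | NEEDS_REVIEW | RESOLVED (hypothetical states)

def reconcile_flags(flags: list[CommunityFlag], reindexed: dict[str, dict]) -> None:
    """Update flag status after a dataset re-index; never delete automatically."""
    for flag in flags:
        record = reindexed.get(flag.occurrence_key)
        if record is None:
            flag.status = "NEEDS_REVIEW"        # record no longer at the source
        elif record.get(flag.term) != flag.flagged_value:
            flag.status = "NEEDS_REVIEW"        # value changed; a human decides
        # otherwise the flag stays OPEN and keeps showing on the portal
```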

@MortenHofft do you have any thoughts on these ideas, and do you know other informatics people who would like to provide feedback? I am aware that @jhnwllr and @ahahn-gbif are working on something similar. Please provide any feedback or add to this if there is something I have missed.

CecSve commented Jul 27, 2022

May relate to this issue: gbif/registry#247

MortenHofft commented Aug 15, 2022

This is a quick brain dump. It is a recurring theme. Normally we use the term annotations. Multiple ideas have been floated over time.

  • AnnoSys has been used by some publishers (we saw close to no annotations - perhaps 3 over 1 year, and I made one of them).
  • We tried briefly labelling occurrences that had a GitHub issue attached to them (but we were blocked by GitHub as we made too many requests).
  • We have discussed rule-based annotations, e.g. where you can create a filter, then draw a polygon and say: all of these have a wrong classification, the species doesn't live there. That rule then applies going forward (a rough sketch of such a rule follows this list).
  • We have discussed a dedicated website, e.g. fix.gbif.org, that allowed community annotations which could then be used to enrich gbif.org and allow for filtering. Essentially: according to the community, this value should be X. Just like we do with GADM and our existing machine interpretations.
  • We have discussed allowing publishers to opt in to being interested in annotations.
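
To make the rule-based idea concrete, here is a minimal sketch with invented names (not an existing GBIF API): a stored rule pairs a filter (here just a taxonKey) with a polygon, and any matching occurrence whose coordinates fall inside the polygon receives the community annotation. "Applying going forward" would then just mean re-running the rule whenever records are indexed.

```python
# Minimal sketch of a rule-based annotation: filter + polygon + comment.
# All class and field names are invented for illustration.
from dataclasses import dataclass

Point = tuple[float, float]          # (longitude, latitude)

def point_in_polygon(p: Point, polygon: list[Point]) -> bool:
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    x, y = p
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside

@dataclass
class AnnotationRule:
    taxon_key: int                   # filter: which taxon the rule applies to
    polygon: list[Point]             # area where the taxon does not occur
    comment: str                     # e.g. "species does not occur here"

def matches(rule: AnnotationRule, occurrence: dict) -> bool:
    """True if the occurrence should receive the rule's annotation."""
    if occurrence.get("taxonKey") != rule.taxon_key:
        return False
    lon = occurrence.get("decimalLongitude")
    lat = occurrence.get("decimalLatitude")
    return lon is not None and lat is not None and \
        point_in_polygon((lon, lat), rule.polygon)
```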

Some of the challenges are:

  • how to reconcile views: if the publisher looks through the feedback and disagrees, then what happens? How do we know that it has been addressed? And who gets to decide?
  • How to make it motivating. Feedback shouldn't just be lost or left to die. If no one listens and updates the record, then what?
  • Time/priority
  • Do we want to be the ones building that system
  • Are enough people actually interested in using it - is there enough of an audience?
  • Others in our community have implemented this in the past. We should probably try to understand their attempts. Or can/should we use them somehow?

I guess no one has provided a clear, thought-through model of how this should work. What is the flow of an annotation?
