Set occurrenceStatus to "doubtful" for outliers #1

peterdesmet · 2018-04-26T14:10:23Z

At the workshop we proposed to have a quality flag (an "occurrenceVerificationFlag") in occurrenceRemarks, based on terminology in Andre Steckenreuter et al. (2016).

I suggest that for records with a low quality flag (i.e. we're pretty sure it's a ghost detection or an outlier), we also set occurrenceStatus to doubtful. GBIF/OBIS already use that field to identify occurrences that should not be harvested or put on maps, which means they don't have to sift through occurrenceRemarks and understand the vocabulary used there to judge how doubtful a record is.

The definition for doubtful in the occurrence status controlled vocabulary is:

The taxon is scored as being present in the area but there is some doubt over the evidence. The doubt may be of different kinds including taxonomic or geographic imprecision in the records.

If there is agreement to do this, I suggest making occurrenceStatus a required field (that should make @albenson-usgs happy 😄), with default value present.

Side note: My initial thought was to use "absent", but the definition for that is that there is evidence that the taxon is not there. That is not what a ghost record or outlier is.
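To make the proposal concrete, here is a minimal sketch of the rule: occurrenceStatus is always filled in, defaults to present, and becomes doubtful only when the quality flag is low. The flag values in `LOW_QUALITY_FLAGS` and the function name are assumptions for illustration; the actual vocabulary would come from the QC procedure that gets agreed on.

```python
# Hypothetical low-quality flag values; placeholders, not the agreed vocabulary.
LOW_QUALITY_FLAGS = {"likely invalid", "invalid"}


def occurrence_status(verification_flag=None):
    """Return 'doubtful' for a low-quality flag, otherwise the default 'present'."""
    if verification_flag is None:
        # No QC flag available: fall back to the proposed default.
        return "present"
    if verification_flag.strip().lower() in LOW_QUALITY_FLAGS:
        return "doubtful"
    return "present"
```

Because the function never returns an empty value, every record would satisfy the "required field" part of the proposal.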


albenson-usgs · 2018-04-30T16:54:57Z

Yes, this sounds like a good working solution to me. @jdpye and @peggynewman would you agree?

jdpye · 2018-04-30T17:26:02Z

Agree with classifying our detections this way. I went looking for a 'probable' field in the nomenclature, and it seems to me that the field says something about whether the species can generally be found in a location, rather than referring to individual sightings or detections. The question is how people are actually using it. In applying it to an individual detection, it doesn't look like we'd be creating too much confusion, and 'doubtful' is a nice way to classify our faith in a phantom detection. Not impossible, but varying degrees of unlikely.

So long as we can be sure that someone stumbling across this data won't be confused by our use of this field, it lines up fairly well for me with how we want to classify our QCed detections.

peggynewman · 2018-04-30T20:11:05Z

It looks like 'excluded' is an option in that vocabulary too. I remember somebody suggesting that most studies clean out non-animal detections (e.g. by removing range test detections). Maybe this is a practical value to include if for some reason they are left in the data?

peterdesmet · 2018-04-30T22:29:32Z

I have thought about “excluded” as well, but that is (just like absent) positively stating that a record is not there, even though it has been reported there as present in the past (e.g. in gray literature). I don’t think that fits our definition here.

Note: it doesn’t mean: to be excluded from analysis/dataset.

Antonarctica · 2018-05-02T07:26:09Z

Hi
Would this still be used in combination with a quality flag (an "occurrenceVerificationFlag") in occurrenceRemarks, based on terminology in Andre Steckenreuter et al. (2016)? Still using that would make sense to me.

For the proposed use of "doubtful" I think it is a good working solution.

I do get the sense from the definition that with some GPS tags on birds/mammals you might be pushing the definition of Present a bit, but happy to accept that as well.

phwalsh · 2018-05-02T07:37:38Z

I agree that using occurrenceStatus is a good idea.

@Antonarctica, my understanding is that both would be used: occurrenceStatus for the existing use case by GBIF/OBIS, and occurrenceVerificationFlag to carry the Steckenreuter et al. (2016) QC flag (note this is only used for acoustic data).
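A rough sketch of what "both would be used" could look like on a single record: the detailed QC flag sits in occurrenceRemarks, while an aggregator only needs occurrenceStatus. The `occurrenceVerificationFlag: ...` text format inside occurrenceRemarks, the flag values, and both function names are assumptions for illustration, not an agreed encoding.

```python
def build_record(occurrence_id, verification_flag, low_quality_flags):
    """Build a record that carries both the QC flag and occurrenceStatus."""
    status = "doubtful" if verification_flag in low_quality_flags else "present"
    return {
        "occurrenceID": occurrence_id,
        "occurrenceStatus": status,
        # Assumed encoding of the flag inside occurrenceRemarks.
        "occurrenceRemarks": f"occurrenceVerificationFlag: {verification_flag}",
    }


def not_doubtful(records):
    """Filter on occurrenceStatus alone, without parsing occurrenceRemarks."""
    return [r for r in records if r["occurrenceStatus"] != "doubtful"]


low = {"invalid"}  # hypothetical low-quality flag value
records = [
    build_record("det-1", "valid", low),
    build_record("det-2", "invalid", low),
]
kept = not_doubtful(records)
```

Here `kept` contains only `det-1`, and the full flag is still available in occurrenceRemarks for anyone who wants the finer-grained QC result.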

jdpye · 2018-05-02T12:13:29Z

Agree we'd set both, and that for acoustics, doubtful would apply to a range of QC values from the QC procedure.

This has the benefit of flexibility when it comes to ranking the occurrences obtained by other tag location methods. GPS and pop-off and light-based locating can each have their own confidence levels and we can map the more dubious ones to 'doubtful' in the occurrenceStatus.
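As an illustration of that flexibility, each location method could have its own confidence cut-off below which a record is mapped to 'doubtful'. The 0 to 1 confidence scale, the method keys, and the threshold numbers below are all placeholders made up for this sketch; nothing here reflects agreed values.

```python
# Hypothetical minimum confidence per location method (0 to 1 scale).
MIN_CONFIDENCE = {
    "acoustic": 0.5,
    "gps": 0.8,
    "pop-off": 0.6,
    "light-based": 0.4,
}


def status_for(method, confidence):
    """Map a method-specific confidence value to an occurrenceStatus."""
    threshold = MIN_CONFIDENCE.get(method)
    if threshold is None:
        # Refuse to guess for methods without a defined cut-off.
        raise ValueError(f"no threshold defined for method {method!r}")
    return "doubtful" if confidence < threshold else "present"
```

Keeping the cut-offs in one table means a new tag type only needs a new entry, and the occurrenceStatus values seen by GBIF/OBIS stay the same.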

peterdesmet added the use case: Mahoney label Oct 24, 2018

peterdesmet mentioned this issue Oct 24, 2018

how to define species observation from non-free-ranging animals #5

Open

timrobertson100 mentioned this issue May 15, 2020

Improve handling of records declaring absence data gbif/pipelines#268

Closed
