Set occurrenceStatus to "doubtful" for outliers #1

Open
peterdesmet opened this issue Apr 26, 2018 · 7 comments

Comments

@peterdesmet
Member

At the workshop we proposed to have a quality flag (an "occurrenceVerificationFlag") in occurrenceRemarks, based on terminology in Andre Steckenreuter et al. (2016).

I suggest that for records with a low quality flag (i.e. we're pretty sure it's a ghost detection or an outlier) we also set occurrenceStatus to doubtful. GBIF/OBIS already use that field to identify occurrences that should not be harvested or put on maps, which means they don't have to sift through occurrenceRemarks and learn the vocabulary used there to understand how doubtful a record is.

The definition for doubtful in the occurrence status controlled vocabulary is:

The taxon is scored as being present in the area but there is some doubt over the evidence. The doubt may be of different kinds including taxonomic or geographic imprecision in the records.

If there is agreement to do this, I suggest making occurrenceStatus a required field (that should make @albenson-usgs happy 😄), with default value present.

Side note: My initial thought was to use "absent", but the definition for that is that there is evidence that the taxon is not there. That is not what a ghost record or outlier is.
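A minimal sketch of the proposal above, assuming the quality flag is available as a string per record. The flag names in `LOW_QUALITY_FLAGS` and the record layout are hypothetical placeholders, not the actual vocabulary from the QC procedure.

```python
# Hypothetical flag values; the real ones would come from the QC procedure.
LOW_QUALITY_FLAGS = {"likely_invalid", "invalid"}


def occurrence_status(verification_flag=None):
    """Return 'doubtful' for low-quality flags, otherwise the default 'present'."""
    if verification_flag in LOW_QUALITY_FLAGS:
        return "doubtful"
    return "present"


# Hypothetical records: one valid, one flagged as invalid, one without a flag.
records = [
    {"occurrenceID": "det-1", "flag": "valid"},
    {"occurrenceID": "det-2", "flag": "invalid"},
    {"occurrenceID": "det-3", "flag": None},
]

# occurrenceStatus is always filled in, so it can be treated as a required field.
for record in records:
    record["occurrenceStatus"] = occurrence_status(record["flag"])
```

Note that "absent" is never produced here: a ghost detection or outlier is marked doubtful, not absent.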

@albenson-usgs
Contributor

Yes, this sounds like a good working solution to me. @jdpye and @peggynewman would you agree?

@jdpye
Member

jdpye commented Apr 30, 2018

Agree with classifying our detections this way. I went looking for a 'probable' field in the nomenclature, and it seems to me that the field says something about whether the species can generally be found in a location, rather than referring to individual sightings or detections. The question is how people are actually using it. In applying it to an individual detection, it doesn't look like we'd be creating too much confusion, and 'doubtful' is a nice way to classify our faith in a phantom detection. Not impossible, but varying degrees of unlikely.

So long as we can be sure that someone stumbling across this data won't be confused by our use of this field, it lines up fairly well for me with how we want to classify our QCed detections.

@peggynewman

It looks like 'excluded' is an option in that vocabulary too. I remember somebody suggesting that most studies clean out non-animal detections (e.g. by removing range test detections). Maybe this is a practical value to include if for some reason they are left in the data?

@peterdesmet
Member Author

I have thought about “excluded” as well, but that is (just like absent) positively stating that the taxon is not there, even though it has been reported as present in the past (e.g. in gray literature). I don’t think that fits our definition here.

Note: it doesn’t mean: to be excluded from analysis/dataset.

@Antonarctica

Hi
Would this still be used in combination with a quality flag (an "occurrenceVerificationFlag") in occurrenceRemarks, based on terminology in Andre Steckenreuter et al. (2016)? Still using that would make sense to me.

For the proposed use of "doubtful" I think it is a good working solution.

I do get the sense from the definition that with some GPS tags on birds/mammals you might be pushing the definition of Present a bit, but happy to accept that as well.

@phwalsh

phwalsh commented May 2, 2018

I agree that using occurrenceStatus is a good idea.

@Antonarctica, my understanding is that both would be used: occurrenceStatus for the existing GBIF/OBIS use case, and occurrenceVerificationFlag to carry the Steckenreuter et al. QC flag (note this is only used for acoustic data).
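As a rough sketch of how both fields could be set together: the function below always fills occurrenceStatus and only adds the verification flag to occurrenceRemarks for acoustic detections. The input keys (`id`, `method`, `qc_flag`), the flag names and the `occurrenceVerificationFlag: <value>` remark format are all assumptions for illustration; the thread does not define them.

```python
# Hypothetical flag values treated as low quality.
DOUBTFUL_FLAGS = {"likely_invalid", "invalid"}


def to_occurrence(detection):
    """Build an occurrence record carrying both occurrenceStatus and the QC flag."""
    record = {"occurrenceID": detection["id"], "occurrenceStatus": "present"}
    flag = detection.get("qc_flag")
    # The QC flag only applies to acoustic data, so other methods get no remark.
    if detection.get("method") == "acoustic" and flag is not None:
        # Assumed remark format; the actual serialization is not specified here.
        record["occurrenceRemarks"] = f"occurrenceVerificationFlag: {flag}"
        if flag in DOUBTFUL_FLAGS:
            record["occurrenceStatus"] = "doubtful"
    return record


acoustic = to_occurrence({"id": "det-1", "method": "acoustic", "qc_flag": "invalid"})
gps = to_occurrence({"id": "det-2", "method": "gps"})
```

A harvester can then rely on occurrenceStatus alone, while the detailed flag stays available in occurrenceRemarks for anyone who wants it.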

@jdpye
Member

jdpye commented May 2, 2018

Agree we'd set both, and that for acoustics, doubtful would apply to a range of QC values from the QC procedure.

This has the benefit of flexibility when it comes to ranking the occurrences obtained by other tag location methods. GPS and pop-off and light-based locating can each have their own confidence levels and we can map the more dubious ones to 'doubtful' in the occurrenceStatus.
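One way to picture that flexibility is a per-method rule table, where each location method decides for itself what counts as dubious. This is only a sketch: the method names, field names (`qc_flag`, `error_m`, `error_km`) and thresholds are made up for illustration and would need to be agreed per method.

```python
# Hypothetical per-method rules; each returns True when a detection is dubious.
DOUBTFUL_RULES = {
    "acoustic": lambda d: d["qc_flag"] in {"likely_invalid", "invalid"},
    "gps": lambda d: d["error_m"] > 1000,     # assumed threshold in metres
    "light": lambda d: d["error_km"] > 200,   # assumed threshold in kilometres
}


def occurrence_status(detection):
    """Map a detection to 'doubtful' or 'present' using its method's rule."""
    rule = DOUBTFUL_RULES.get(detection["method"])
    if rule is not None and rule(detection):
        return "doubtful"
    # Methods without a rule fall back to the default value.
    return "present"


statuses = [
    occurrence_status({"method": "acoustic", "qc_flag": "valid"}),
    occurrence_status({"method": "gps", "error_m": 5000}),
    occurrence_status({"method": "light", "error_km": 50}),
]
```

Adding a new method (pop-off tags, for example) then only means adding one entry to the table, while the occurrenceStatus values stay the same for downstream users.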


6 participants