NEFBMP2012-19 - Duplicates in source data #1

MelinaHoule · 2022-06-08T16:04:49Z

Duplicates are rows that share the same Location, Date/Time, Species, Abundance and Protocols (distance/duration).

NEFBMP2012-19 has duplicates in the source data (sheet: Bird Data).

Example: Point_number = 601;
Observer: Wildgust, Allon;
Date: 2015-06-20;
species: BADO;

We treat them as duplicates for now.

We are waiting to hear back from our partner to confirm whether these are real duplicates or whether we should add them up to give an abundance of 2.
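
For reference, the duplicate check described above can be sketched roughly as follows. This is a minimal illustration only, not the script used for the upload: the column names in `KEY_FIELDS` are assumptions (the real Bird Data sheet may label them differently), and the distance/duration values and the second species are made up.

```python
from collections import Counter

# Hypothetical column names standing in for Location, Date/Time, Species,
# Abundance and Protocols (distance/duration).
KEY_FIELDS = ("location", "date", "species", "abundance", "distance", "duration")


def duplicate_groups(rows, key_fields=KEY_FIELDS):
    """Return {key: count} for every key that appears on more than one row."""
    counts = Counter(tuple(row[f] for f in key_fields) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Illustrative rows shaped like the example above (protocol values are invented).
rows = [
    {"location": 601, "date": "2015-06-20", "species": "BADO",
     "abundance": 1, "distance": "0-50", "duration": "0-10"},
    {"location": 601, "date": "2015-06-20", "species": "BADO",
     "abundance": 1, "distance": "0-50", "duration": "0-10"},
    {"location": 601, "date": "2015-06-20", "species": "OVEN",
     "abundance": 1, "distance": "0-50", "duration": "0-10"},
]

groups = duplicate_groups(rows)
for key, n in groups.items():
    print(n, "identical rows for", key)
```

Only rows that match on every key field are grouped, so two records that differ in any one attribute are not flagged.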

MelinaHoule · 2022-06-22T13:50:12Z

Answer from the data partner:
"I don't believe these are duplicate entries--all data were checked with original field sheets after data entry occurred, so these raw data should be summed. Of course I can't rule out the possibility that there was only one BADO which was double-entered, and then that error was missed during the error-checking process, but that would be an isolated occurrence and very unlikely to happen."

Duplicates occur 10% of the time, so they can't be considered isolated. I propose to sum those rows.
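
A rough sketch of what summing those rows could look like, together with one possible way of measuring how often duplicates occur. The column names are again assumptions, and `duplicate_share` is only one interpretation of the 10% figure (the share of rows that belong to a group of identical rows); the rows are made up.

```python
from collections import Counter, defaultdict

# Hypothetical column names; the real sheet may differ.
KEY_FIELDS = ("location", "date", "species", "abundance", "distance", "duration")


def row_key(row, key_fields=KEY_FIELDS):
    return tuple(row[f] for f in key_fields)


def duplicate_share(rows):
    """Fraction of rows that belong to a group of two or more identical rows."""
    if not rows:
        return 0.0
    counts = Counter(row_key(r) for r in rows)
    return sum(n for n in counts.values() if n > 1) / len(rows)


def sum_duplicates(rows):
    """Collapse identical rows into one row whose abundance is their sum."""
    groups = defaultdict(list)
    for row in rows:
        groups[row_key(row)].append(row)
    summed = []
    for members in groups.values():
        merged = dict(members[0])  # copy so the input rows are left untouched
        merged["abundance"] = sum(m["abundance"] for m in members)
        summed.append(merged)
    return summed


# Illustrative rows only.
rows = [
    {"location": 601, "date": "2015-06-20", "species": "BADO",
     "abundance": 1, "distance": "0-50", "duration": "0-10"},
    {"location": 601, "date": "2015-06-20", "species": "BADO",
     "abundance": 1, "distance": "0-50", "duration": "0-10"},
    {"location": 601, "date": "2015-06-20", "species": "OVEN",
     "abundance": 1, "distance": "0-50", "duration": "0-10"},
]

share = duplicate_share(rows)
summed = sum_duplicates(rows)
```

Because the grouping key includes abundance, only exact duplicates are merged; two records of the same species at the same point with different abundances stay as separate rows.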

Another kind of duplicate exists: 24 rows have identical attributes except for detection_cues. Since detection cues are recorded in the extended table, I propose that we sum the abundance in the survey table but keep the rows separate in the extended table to record the proper behavior. The abundance attribute is found in both tables. To avoid confusion, we may need to rename abundance in the extended table to reflect that difference.
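
The proposal for the detection_cues case could look something like the sketch below. It is illustrative only: `SURVEY_KEY` is a simplified, assumed key (protocol fields are left out for brevity), `abundance_by_cue` is a placeholder for whatever the renamed extended-table attribute ends up being called, and the cue codes are invented.

```python
from collections import defaultdict

# Simplified, hypothetical key for one survey record.
SURVEY_KEY = ("location", "date", "species")


def build_tables(rows):
    """Sum abundance per survey record, but keep one extended row per detection cue."""
    survey_totals = defaultdict(int)
    extended_totals = defaultdict(int)
    for row in rows:
        key = tuple(row[f] for f in SURVEY_KEY)
        survey_totals[key] += row["abundance"]
        extended_totals[key + (row["detection_cues"],)] += row["abundance"]

    survey = [
        dict(zip(SURVEY_KEY, key), abundance=total)
        for key, total in survey_totals.items()
    ]
    extended = [
        # "abundance_by_cue" is a placeholder name, not an agreed column name.
        dict(zip(SURVEY_KEY + ("detection_cues",), key), abundance_by_cue=total)
        for key, total in extended_totals.items()
    ]
    return survey, extended


# Illustrative rows: identical except for detection_cues (cue codes are made up).
rows = [
    {"location": 601, "date": "2015-06-20", "species": "BADO",
     "abundance": 1, "detection_cues": "S"},
    {"location": 601, "date": "2015-06-20", "species": "BADO",
     "abundance": 1, "detection_cues": "C"},
]

survey, extended = build_tables(rows)
```

With this layout the survey table carries the total for the record, while the extended table keeps the per-cue counts, which is why a distinct attribute name in the extended table would help avoid confusion.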

MelinaHoule added the "Temporary upload" label (Uploaded data that have unanswered issues) Jun 8, 2022

MelinaHoule self-assigned this Jun 8, 2022

MelinaHoule added the "duplicate" label (This issue or pull request already exists) Jun 22, 2022
