Duplicates in datafile #860

Answered by boer0107-zz
Aminaodw asked this question in Q&A
Nov 29, 2021 · 1 comments · 4 replies

We have not done research into the effect of duplicates on the performance of ASReview. However, we expect that the risk lies mainly in the records that you include and that have duplicates.

First of all, you will have to include them twice (you will probably never see the duplicate of an exclusion, because it is pushed to the back of your set). This is only inconvenient; it does not harm your results.

More importantly, a duplicated inclusion gets more weight than an inclusion without duplicates. This might have a negative impact on performance. For instance, if inclusions with duplicates represent a specific subset of your results, this subset will be more prominent in your inclusions bec…
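
If you would rather remove duplicates from the data file before importing it, something along these lines could work. This is only a minimal sketch, not part of ASReview itself: the field names `doi` and `title`, the helper names `normalize` and `drop_duplicates`, and the sample records (including the DOIs) are all assumptions made up for illustration.

```python
import re


def normalize(text):
    """Lowercase and collapse punctuation/whitespace so trivial variants match."""
    return re.sub(r"[^a-z0-9]+", " ", (text or "").lower()).strip()


def drop_duplicates(records):
    """Keep the first record per DOI, or per normalized title when the DOI is missing.

    `records` is a list of dicts; the "doi" and "title" keys are assumed names.
    """
    seen = set()
    unique = []
    for rec in records:
        doi = (rec.get("doi") or "").strip().lower()
        key = ("doi", doi) if doi else ("title", normalize(rec.get("title")))
        if key in seen:
            continue  # same DOI or same normalized title as an earlier record
        seen.add(key)
        unique.append(rec)
    return unique


# Hypothetical sample data: two pairs of duplicates.
records = [
    {"title": "Active learning for screening", "doi": "10.1000/abc"},
    {"title": "Active Learning for Screening.", "doi": "10.1000/ABC"},
    {"title": "A different study", "doi": ""},
    {"title": "A Different  Study", "doi": ""},
]
unique = drop_duplicates(records)
print(len(records), "->", len(unique))  # prints: 4 -> 2
```

Note that this sketch will not match a record that has a DOI against a copy of the same record without one, so it is worth checking the result by hand.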

Answer selected by Rensvandeschoot