Irrelevant records #1608
-
For my PhD I'm conducting a systematic review of clinical trials in disease Y. Many of the articles in my dataset are, however, things like supplementary articles, poster communications, abstracts, clinical trial registrations, etc. I've marked these as irrelevant, but now I'm worried that in this process the AI also discards some important papers, as it does not know why I've marked them irrelevant. For example: ASReview presented me with a new article, a clinical trial with therapeutic X. As this was a poster communication (and not a full research article), I marked this paper irrelevant. However, since then I have not actually seen an original full-text article on therapeutic X pass through ASReview, even though when I simply google this substance, two trials have been published (which, based on the tiab, should be included in my database). As I'm currently on a streak of 20-or-so irrelevant records, I'm kinda wondering if I've made a mistake. Can anyone here confirm my hypothesis that by marking ASReview articles irrelevant based on non-tiab content, I'm actually losing articles?
Replies: 1 comment 1 reply
-
Thank you for using ASReview! The machine learning model uses the texts of the labeled records (title and abstract) to predict relevance scores for the unseen records. If you exclude papers based on information not accessible to the machine, for example publication type, the model indeed gets confused. This happens if, for example, a poster contains relevant text but you exclude it because it is a poster. I would recommend labeling such a record relevant based on its content (to help the model) and adding a note in the notes field: "exclude because poster." This notes field is exported, so you can filter out such records later. This way, you train the model with correct information. Also, you can report the number of exclusions based on publication type in your PRISMA flow chart.
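The post-hoc filtering step described above could look something like this with pandas. This is a minimal sketch, not an official ASReview recipe: the column names `included` and `notes`, the note text, and the inline stand-in data are all assumptions; check the headers of your own exported file, as they can differ by ASReview version and export format.

```python
import pandas as pd

# A tiny stand-in for an ASReview export; the real file would be loaded with
# pd.read_csv("asreview_export.csv"). The column names "included" and "notes"
# are assumptions -- verify them against your own export.
df = pd.DataFrame({
    "title": ["Trial A (full text)", "Trial B (poster)", "Trial C (full text)"],
    "included": [1, 1, 0],
    "notes": [None, "exclude because poster", None],
})

# Keep the records labeled relevant during screening...
relevant = df[df["included"] == 1]

# ...then drop those flagged in the notes field as publication-type exclusions.
final = relevant[
    ~relevant["notes"].fillna("").str.contains("exclude because poster", case=False)
]

print(final["title"].tolist())  # -> ['Trial A (full text)']
```

The count of rows dropped in the second step is also the number you would report as "excluded by publication type" in the PRISMA flow chart.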