Irrelevant records #1608
-
For my PhD I'm conducting a systematic review of clinical trials in disease Y. Many of the articles in my dataset are, however, things like supplementary articles, poster communications, abstracts, clinical trial registrations, etc. I've marked these as irrelevant, but now I'm worried that in this process the AI also discards some important papers, as it does not know why I've marked them irrelevant. For example: ASReview presented me with a new article, a clinical trial with therapeutic X. As this was a poster communication (and not a full research article), I marked this paper irrelevant. However, since then I have not actually seen an original full-text article on therapeutic X pass through ASReview, even though when I simply google this substance, two trials have been published (which, based on the tiab, should be included in my database). As I'm currently on a streak of 20-or-so irrelevant records, I'm kinda wondering if I've made a mistake. Can anyone here confirm my hypothesis that by marking ASReview articles irrelevant based on non-tiab content, I'm actually losing articles?
Replies: 1 comment 1 reply
-
Thank you for using ASReview! The machine learning model uses the texts of the labeled records (title and abstract) to predict relevance scores for the unseen records. If you exclude papers based on information not accessible to the machine, for example publication type, the model indeed gets confused. This happens if, for example, a poster contains relevant text but you exclude it because it is a poster. I would recommend labeling such a record relevant based on its content (to help the model) and adding a note in the notes field: "exclude because poster." This notes field is exported, so you can filter out such records later. This way, you train the model with correct information. Also, you can report the number of exclusions based on publication type in your PRISMA flow chart.
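The post-hoc filtering step described above could look something like this with pandas. This is a minimal sketch, not an official ASReview recipe: the column names `included` and `notes`, the note text, and the inline stand-in data are all assumptions; check the headers of your own exported file, as they can differ by ASReview version and export format.

```python
import pandas as pd

# A tiny stand-in for an ASReview export; the real file would be loaded with
# pd.read_csv("asreview_export.csv"). The column names "included" and "notes"
# are assumptions -- verify them against your own export.
df = pd.DataFrame({
    "title": ["Trial A (full text)", "Trial B (poster)", "Trial C (full text)"],
    "included": [1, 1, 0],
    "notes": [None, "exclude because poster", None],
})

# Keep the records labeled relevant during screening...
relevant = df[df["included"] == 1]

# ...then drop those flagged in the notes field as publication-type exclusions.
final = relevant[
    ~relevant["notes"].fillna("").str.contains("exclude because poster", case=False)
]

print(final["title"].tolist())  # -> ['Trial A (full text)']
```

The count of rows dropped in the second step is also the number you would report as "excluded by publication type" in the PRISMA flow chart.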