Segmented screening to improve ASReview performance for transdisciplinary SLRs? #1580
Replies: 1 comment 3 replies
Hi @TimothyMarcroft, I am partly copying my answer from #1547. Some of the models available in ASReview are context-based classification models, as opposed to vocabulary-based ones. These models are much more context dependent and have proven resistant to divergent terminology. To simplify (a lot): these models analyze the contextual usage of every word relative to every other word, so if two terms refer to the same concept and are used in similar contexts, they will end up close together in the embedding space. This is in contrast to simpler techniques like TF-IDF, which rely on the frequency of individual terms and do not capture such nuances, leaving the classification model to make those connections on its own. In practical terms, if you are dealing with a corpus that has a lot of heterogeneous terminology, more advanced models like doc2vec or sBERT may provide more accurate representations of the underlying semantic structures: they can capture the semantic similarity between terms that are contextually similar, even if the terms themselves differ. Keep in mind that sBERT will take some time to train, but in your situation I would highly recommend its use as a feature extractor.
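To see why purely term-frequency features struggle here, consider a minimal stdlib sketch (not ASReview code, and deliberately simplified: no TF-IDF weighting or stopword removal). Two abstracts describe the same real-world phenomenon in different disciplinary lexicons, so their bag-of-words similarity comes almost entirely from function words rather than content:

```python
from collections import Counter
from math import sqrt

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Two toy abstracts about the same phenomenon, phrased in
# different disciplinary vocabularies (made-up examples).
doc_psych = "participant burnout and emotional exhaustion in care settings"
doc_econ = "workforce attrition and labour turnover in the care sector"

vec_psych = Counter(doc_psych.split())
vec_econ = Counter(doc_econ.split())

# The only shared tokens are "and", "in", and "care" — the
# similarity says little about the shared underlying concept.
print(f"term-overlap similarity: {cosine_similarity(vec_psych, vec_econ):.2f}")
```

A context-based embedding model such as sBERT would instead place "burnout"/"exhaustion" and "attrition"/"turnover" near each other in the vector space, which is exactly what a transdisciplinary corpus needs.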
Hello ASReview community,
PhD student in the social sciences here. I joined the summer school put on by Utrecht University a few months ago and I am now working on the methodology for my first systematic literature review. I plan to use ASReview during the screening process, but I am a little concerned that my specific context and research question will make the tool less effective than it would otherwise be. The thing is, my research question is quite pragmatic and my approach is transdisciplinary. This means there are many different ways to refer to my objects of study in the literature, with each discipline often having its own lexicon. I want to include all of the papers that discuss an instance of a real-world phenomenon, not just those that use a single set of vocabulary to describe it. So, compared to a relatively mono-disciplinary SLR, the papers I want to include will be more linguistically different from one another, which seems like it could cause performance issues for ASReview. I have an idea of how to overcome this challenge while still getting good value out of ASReview: journal-segmented screening.
In this approach, I would perform a relatively broad search in several databases, combine the results, deduplicate, and then segment my screening phase by journal. That is, I would create a separate .csv file and run a separate screening procedure for each journal, with each following the same stopping criteria. My logic is that journals are probably the most accessible proxy I am going to find for disciplinary affiliation, and that (on average) the vocabulary used by a given article will be more similar to that of other articles in the same journal than to articles published elsewhere. If I understand things correctly, this should improve the performance of ASReview. I would then merge the resulting labeled datasets once I had finished and deduplicate again (although I would expect few duplicates at this stage).
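The pre-screening half of that workflow can be sketched with the standard library alone. This is a hypothetical illustration, not ASReview tooling; the column names (`journal`, `title`, `abstract`) and the output directory name are assumptions you would adapt to your own export format:

```python
import csv
from collections import defaultdict
from pathlib import Path

def split_by_journal(rows, out_dir: Path, journal_col: str = "journal"):
    """Group already-deduplicated records by journal and write one
    CSV per journal, ready for a separate ASReview screening run."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[journal_col]].append(row)
    out_dir.mkdir(parents=True, exist_ok=True)
    for journal, records in groups.items():
        # Sanitize the journal name so it is a safe filename.
        safe_name = "".join(c if c.isalnum() else "_" for c in journal)
        with (out_dir / f"{safe_name}.csv").open(
            "w", newline="", encoding="utf-8"
        ) as f:
            writer = csv.DictWriter(f, fieldnames=records[0].keys())
            writer.writeheader()
            writer.writerows(records)
    return groups

# Toy deduplicated search result (titles and journals are made up).
rows = [
    {"title": "A", "abstract": "...", "journal": "J. of Sociology"},
    {"title": "B", "abstract": "...", "journal": "Health Economics"},
    {"title": "C", "abstract": "...", "journal": "J. of Sociology"},
]
groups = split_by_journal(rows, Path("segmented_screening"))
print({journal: len(records) for journal, records in groups.items()})
```

Merging the labeled outputs afterwards would be the reverse: read each per-journal CSV back in, concatenate, and deduplicate on a stable identifier such as DOI.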
One problem I see with this approach is that I can't be sure there are relevant records in every journal. In fact, I can be quite sure that some of the journals will include zero relevant records, making the selection of a stopping rule more complicated. Perhaps this is solvable, but I'm not sure.
Does this seem like an approach that would be worth the extra effort? Is the SAFE procedure, by combining multiple models, strong enough not to need this extra complication? I would love some feedback from people who have experience with the tool and a deeper technical understanding of it than I do. Thanks!