Introducing ASReview-Preprocess extension #1406
I'm glad to see that you're focusing on data quality in the ASReview-preprocess extension. As a statistician, I couldn't agree more about the importance of high-quality data in active learning for text screening. The adage "garbage in, garbage out" holds true, and I always encourage high-quality data input for high-quality results. The ASReview-Datatools repository (https://github.com/asreview/asreview-datatools) contains various data preprocessing tools, including a simple deduplication script based on DOI, title, and abstract. However, more advanced deduplication strategies would be beneficial. Retrieving missing data is also crucial for data quality. We apply a similar strategy to the benchmark datasets (https://github.com/asreview/systematic-review-datasets), and a new release incorporating this strategy is coming soon. Your work on the ASReview-preprocess extension is highly appreciated, and it's great to see that you're following a similar path. Have you considered merging your developments into the Datatools package? We're open to discussing how your work could be integrated, or to understanding the reasons for keeping the projects separate. Parallel development is always an option, and we're eager to collaborate and support each other's efforts. Looking forward to hearing your thoughts!
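To make the idea of deduplicating on DOI, title, and abstract concrete, here is a minimal sketch. It is not the ASReview-Datatools implementation: the function names and the record layout (a dict with `doi`, `title`, and `abstract` keys) are assumptions made for illustration only.

```python
import re


def normalize_text(value):
    """Lowercase and drop everything except ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", (value or "").lower())


def normalize_doi(value):
    """Lowercase a DOI and strip a leading doi.org resolver URL, if present."""
    doi = (value or "").strip().lower()
    return re.sub(r"^https?://(dx\.)?doi\.org/", "", doi)


def deduplicate(records):
    """Keep the first record seen for each DOI or title+abstract key.

    `records` is assumed to be a list of dicts with optional
    'doi', 'title', and 'abstract' keys (hypothetical layout).
    """
    seen_dois = set()
    seen_texts = set()
    unique = []
    for record in records:
        doi = normalize_doi(record.get("doi"))
        text = (
            normalize_text(record.get("title"))
            + "|"
            + normalize_text(record.get("abstract"))
        )
        has_text = text != "|"  # both title and abstract empty -> no text key
        if (doi and doi in seen_dois) or (has_text and text in seen_texts):
            continue  # duplicate by DOI or by title+abstract
        if doi:
            seen_dois.add(doi)
        if has_text:
            seen_texts.add(text)
        unique.append(record)
    return unique
```

Exact matching on normalized strings like this misses near-duplicates such as records with small typos or truncated abstracts, which is where more advanced strategies would help.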
Hello all,
I am working on the ASReview-preprocess extension, which aims to cover preprocessing needs such as finding missing data for records and deduplication. The purpose of this extension is to make research accessible to those who cannot access paid resources and tools such as EndNote.
The current functionality includes a CLI for:
I am working on, and looking for collaborators/contributors for:
Please check the repository and share suggestions in its discussion section. PRs are very much welcome.
Note: The extension is under development and is not ready to be used in the ASReview workflow just yet.