Add an easier way of adding datasets #1507
Comments
Thank you for your feature request and for championing open science. We wholeheartedly share your commitment to transparency, as reflected in our open-source pipeline. Our overarching goal is to harness collective data to continually refine and improve models. To start the discussion, here is what is already available:
We're always open to more suggestions and input, so let's explore even more ways to interact with our users' data!
What I meant is that I created some datasets and would like to make them easily accessible to other researchers. Importing via URL is fine, but it doesn't display much information about the dataset. Having the option to import something like a JSON file with some metadata to be displayed would be informative for the user, much like in the screenshot above. Also, implementing something like the benchmark datasets part, but for community datasets, would be great. To be clearer: having a tab with This would be useful not only for creating a systematic review in the strict sense, but also as a way of finding new papers in an area of interest that were just published at a conference, which is closer to my use case. So, for example, a new conference makes its papers available, someone creates a dataset with information from this conference, and users can readily access this information and start reviewing new papers of interest.
Yes, I totally understand! I just wanted to give an overview of what is already possible :-)
I already cited the template in the issue description. My problem with this approach is having to create a package that needs to be installed with pip. I don't think a dataset has enough information or functionality to need a package of its own. It should be something smoother, like creating a YAML or JSON file with metadata that points to where the real data should be downloaded from. Note that this solution is already implemented for JSON files in the BenchmarkDataGroup. I think it just needs to be made available for Oracle mode and documented.
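To make the metadata-file idea concrete, here is a minimal sketch of what reading such a file could look like. It does not use ASReview's own code: the field names (`dataset_id`, `title`, `url`, `description`, `authors`, `labeled`), the `DatasetMetadata` class, and the example URL are all hypothetical and are not taken from the BenchmarkDataGroup format.

```python
import json
from dataclasses import dataclass, field

# Hypothetical set of fields a community dataset file must provide.
REQUIRED_FIELDS = ("dataset_id", "title", "url")


@dataclass
class DatasetMetadata:
    """Illustrative container for dataset metadata (not an ASReview class)."""
    dataset_id: str
    title: str
    url: str  # where the real data would be downloaded from
    description: str = ""
    authors: list = field(default_factory=list)
    labeled: bool = False


def parse_metadata(text: str) -> DatasetMetadata:
    """Parse a JSON metadata document and check the required fields."""
    raw = json.loads(text)
    missing = [key for key in REQUIRED_FIELDS if key not in raw]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    return DatasetMetadata(
        dataset_id=raw["dataset_id"],
        title=raw["title"],
        url=raw["url"],
        description=raw.get("description", ""),
        authors=list(raw.get("authors", [])),
        labeled=bool(raw.get("labeled", False)),
    )


# Made-up example file content; the URL is a placeholder.
EXAMPLE = """
{
  "dataset_id": "example-conf-2023",
  "title": "Example conference papers",
  "url": "https://example.org/data/example-conf-2023.csv",
  "authors": ["A. Researcher"],
  "labeled": false
}
"""

meta = parse_metadata(EXAMPLE)
print(meta.title, "->", meta.url)
```

The point of the sketch is that the file only describes the dataset and says where to fetch it, so nothing needs to be packaged or installed with pip.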
This is a great idea. Somehow I never thought about this. We welcome contributions for this, and our team is also interested in implementing it.
Feature Request
Is your feature request related to a problem? Please describe.
Currently, adding a dataset to be used by other users requires starting from a given template and then pip installing the dataset.
Describe the solution you'd like
An easier way of adding an (un)labeled dataset to be used in ASReview.
Describe alternatives you've considered
Maybe something like adding from URL, but giving a link to a JSON file like the one used in BenchmarkDataGroup.
Teachability, Documentation, Adoption, Migration Strategy
After reading the info from the json file, it could display some information like the one exhibited in the Benchmarks dataset panel, but also for unlabeled datasets:
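As a rough illustration of the display step, the sketch below turns already-parsed metadata into a few lines of text for a dataset panel. The function name `format_dataset_card` and the keys `title`, `authors`, `n_records`, `description`, and `labeled` are assumptions made for this example; they are not ASReview's API or the fields of the Benchmarks panel.

```python
def format_dataset_card(meta: dict) -> str:
    """Build a short, human-readable summary from a metadata dict.

    All keys are hypothetical; missing optional keys are simply skipped,
    so the same function works for labeled and unlabeled datasets.
    """
    kind = "labeled" if meta.get("labeled") else "unlabeled"
    lines = [f"{meta.get('title', 'Untitled dataset')} ({kind})"]

    authors = meta.get("authors") or []
    if authors:
        lines.append("Authors: " + ", ".join(authors))
    if "n_records" in meta:
        lines.append(f"Records: {meta['n_records']}")
    if meta.get("description"):
        lines.append(meta["description"])
    return "\n".join(lines)


# Made-up metadata for an unlabeled community dataset.
card = format_dataset_card(
    {
        "title": "Example conference papers",
        "authors": ["A. Researcher", "B. Reviewer"],
        "n_records": 120,
        "labeled": False,
    }
)
print(card)
```

Treating every field except the title as optional keeps the barrier low for people who only want to share a list of papers with a link.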