Are you interested in publishing to huggingface/datasets ? #24

Open
richarddwang opened this issue Apr 7, 2021 · 4 comments

Comments

@richarddwang

It's a little bit hard for PyTorch users to evaluate their models on the benchmark.

Would you be willing to add your datasets to huggingface/datasets?
There is a detailed guide on how to add a dataset (https://huggingface.co/docs/datasets/add_dataset.html), and it shouldn't be hard, since you can refer to the processing scripts of other datasets.

If this benchmark is added to huggingface/datasets, which supports NumPy/Pandas/PyTorch/TensorFlow/JAX, I believe it will become more accessible and more widely used.

@MostafaDehghani
Collaborator

Agreed, it's a great idea, but we are a bit out of cycles for doing this. I added it to the list of TODOs, but it's unlikely that we will get to it any time soon.

@alexmathfb

@richarddwang I may end up re-writing LRA for PyTorch. In that case, I'd be happy to port the datasets to huggingface.

Q. Which types of test cases do you think would adequately test the code?

For example, I envision a test file that loops through the JAX dataloader and the PyTorch dataloader and checks that their outputs are identical.
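
A minimal sketch of such a check, under some assumptions: each loader yields batches as dicts mapping field names to array-like values, and those values expose `.tolist()` (as NumPy arrays, JAX arrays, and PyTorch tensors do). The helper names `to_plain` and `assert_loaders_match` are hypothetical and not part of LRA or huggingface/datasets. The comparison is exact, which suits integer token IDs and labels; float features would need a tolerance instead.

```python
from itertools import zip_longest

# Sentinel used to detect that one loader ran out of batches before the other.
_MISSING = object()


def to_plain(value):
    """Convert an array-like exposing .tolist() to nested Python lists."""
    return value.tolist() if hasattr(value, "tolist") else value


def assert_loaders_match(loader_a, loader_b):
    """Raise AssertionError at the first batch where the two loaders disagree.

    Returns the number of batches compared.
    """
    count = 0
    pairs = zip_longest(loader_a, loader_b, fillvalue=_MISSING)
    for index, (a, b) in enumerate(pairs):
        assert a is not _MISSING and b is not _MISSING, (
            f"loaders differ in length at batch {index}"
        )
        assert a.keys() == b.keys(), f"batch {index}: field names differ"
        for key in a:
            assert to_plain(a[key]) == to_plain(b[key]), (
                f"batch {index}: field {key!r} differs"
            )
        count += 1
    return count


# Stand-ins for the two real dataloaders (plain lists of batches here).
reference = [{"inputs": [[1, 2], [3, 4]], "targets": [0, 1]}]
candidate = [{"inputs": [[1, 2], [3, 4]], "targets": [0, 1]}]
num_batches = assert_loaders_match(reference, candidate)
```

In a real test, `reference` and `candidate` would be replaced by the JAX and PyTorch dataloaders, both built with the same split and with shuffling disabled so that batch order is deterministic.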

@vanzytay
Collaborator

@alexmathfb this sounds great.

@richarddwang
Author

@alexmathfb Sorry for the late reply.
That would be great!!
BTW, I recommend creating an issue or a draft PR on HF/datasets; people there are willing and able to provide precise support for porting the datasets.


4 participants