Cannot load huggingface:imagenet-1k dataset due to parse error #5105
Comments
Hello @christian-steinmeyer! As HF and TFDS have different naming rules, you will have to adapt the dataset name to follow TFDS' naming: in this case, the correct name would be … As a pointer, you can refer to datasets/tensorflow_datasets/core/dataset_builders/huggingface_dataset_builder.py, line 124 in 3e6bec0.
We will update our documentation so that this is clearer for users!
That worked, thanks! And yes, an update to the documentation would be very helpful!
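For readers hitting the same parse error, here is a minimal sketch of the renaming discussed in this thread. It assumes only what the thread confirms: the hyphen in `imagenet-1k` has to become an underscore (`imagenet_1k`). The helper names `to_tfds_name` and `tfds_load_name` are hypothetical and not part of TFDS; the actual conversion rules live in huggingface_dataset_builder.py and may cover more cases than this.

```python
def to_tfds_name(hf_name: str) -> str:
    """Hypothetical helper: adapt a Hugging Face dataset name to TFDS naming.

    Assumption: only the hyphen-to-underscore case seen in this thread
    is handled; other TFDS naming rules are not reproduced here.
    """
    return hf_name.replace("-", "_")


def tfds_load_name(hf_name: str) -> str:
    """Build the 'huggingface:<name>' string to pass to tfds.load."""
    return f"huggingface:{to_tfds_name(hf_name)}"


print(tfds_load_name("imagenet-1k"))  # prints huggingface:imagenet_1k
```

The resulting string is what would be passed as the first argument to `tfds.load`.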
@ccl-core Quick follow-up question: downloading the dataset worked. However, after generating splits, the …
Hi again! I found the
tfds.load(
    'huggingface:imagenet_1k',
    data_dir=IMAGE_DIR,
    shuffle_files=True,
    builder_kwargs={"tfds_num_proc": N_JOBS}
)
In the meantime, my original try ran through (without …
Same problem here. It runs at ~20 examples/s and eventually, after a day or so, it crashes.
Short description
When following the instructions here, I cannot download the imagenet-1k dataset from huggingface.
Environment information
Operating System: macOS Sonoma 14.0 (23A344)
Python version: 3.10.10
tensorflow-datasets/tfds-nightly version: 4.9.3.dev202310060044
tensorflow/tf-nightly version: 2.13.0
Does the issue still exist with the last tfds-nightly package (pip install --upgrade tfds-nightly)? Yes
Reproduction instructions
Stacktrace
Expected behavior
The parser properly parses the string given in the documentation and downloading the dataset succeeds.
Additional context