Meta-Dataset in TFDS: Getting as_numpy_iterator() from dataset returned from api.meta_dataset takes a very long time #83
Comments
I believe most of that time is spent reading data and filling shuffle buffers. Each training class in each training source is instantiated as its own dataset with its own shuffle buffer (this is how examples are sampled from specific classes to form episodes), and by default … What happens if you set …?
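The per-class shuffle buffers explain why startup is so slow: before the pipeline can yield its first episode, every class dataset must fill its own buffer. A back-of-the-envelope sketch (the class count and buffer sizes below are illustrative assumptions, not Meta-Dataset's actual defaults):

```python
# Rough model of the startup cost of per-class shuffle buffers:
# each of the C class datasets must read B examples to fill its
# buffer before the pipeline can produce its first episode.

def prefill_reads(num_classes: int, shuffle_buffer_size: int) -> int:
    """Total examples read before the first episode can be yielded."""
    return num_classes * shuffle_buffer_size

# Hypothetical numbers, for illustration only.
large = prefill_reads(num_classes=4000, shuffle_buffer_size=1000)  # 4,000,000 reads
small = prefill_reads(num_classes=4000, shuffle_buffer_size=100)   #   400,000 reads
print(large, small, large // small)  # shrinking the buffer cuts startup 10x
```

This is why lowering the shuffle buffer size is the first knob to try: the prefill cost scales linearly with both the number of class datasets and the per-class buffer size.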
Thanks for looking into this. I did make the change that you suggested (…). Starting TFDS reader …

Thus, creating an iterator with a single dataset is acceptably fast, but creating an iterator over multiple datasets (so you can meta-train on MDv2) is unacceptably slow. As I mentioned above, the time seems to be spent in …

This is a major blocker for us. If I can help debug in any way, I would be happy to.

John
I am trying to use the new Meta-Dataset in TFDS APIs, and I have hit a critical performance problem.
When I run the sample code to "Train on Meta-Dataset episodes" (with some added lines to record times), it takes about 3 or 4 minutes to create the dataset with `api.meta_dataset`, and roughly 1 hour to create the iterator with `episode_dataset.take(4).as_numpy_iterator()`. Here is the code I am running: …

I have run this on Linux and Windows with similar results. The time seems to be spent in … in the file `gen_dataset_ops.py`, which drops into C++ code that I didn't debug into.
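To localize where the time goes, it helps to wrap each pipeline stage in a small wall-clock timer. The helper below is generic; the commented-out lines show how it would wrap the `api.meta_dataset` and `as_numpy_iterator()` steps mentioned in this report (those calls are taken from the issue and are not re-verified here):

```python
import time

def timed(label, fn):
    """Run fn(), print its wall-clock duration, and return its result."""
    start = time.perf_counter()
    result = fn()
    print(f"{label}: {time.perf_counter() - start:.1f}s")
    return result

# Sketch of usage against the pipeline in this report (hypothetical wiring):
# episode_dataset = timed("api.meta_dataset", lambda: api.meta_dataset(...))
# it = timed("take(4).as_numpy_iterator",
#            lambda: episode_dataset.take(4).as_numpy_iterator())

# Self-contained demonstration:
value = timed("sleep", lambda: time.sleep(0.1) or 42)
print(value)  # 42
```

Timing each stage separately distinguishes graph-construction cost (building the datasets) from the buffer-fill cost that only hits when the first element is actually requested.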
Note that creating an iterator on `api.episode_dataset` for evaluation is reasonably quick: omniglot takes the longest at about 3 minutes, but the others take only a few seconds.

This issue makes training on MDv2 from TFDS more or less impossible.