Add backend-agnostic worker-process data loading #19692

Closed
LarsKue opened this issue May 8, 2024 · 8 comments
Labels
stat:awaiting response from contributor type:support User is asking for help / asking an implementation question. Stackoverflow would be better suited.

Comments

@LarsKue
Contributor

LarsKue commented May 8, 2024

Worker-process data loading is an integral part of many training applications. Exposing the API provided in keras.utils.PyDataset to backends other than tensorflow would be a valuable addition.

As an alternative, flags like workers, use_multiprocessing, and max_queue_size could be added to Model.fit, but this may be confusing when the user passes data that is fully in memory.

@fchollet
Member

fchollet commented May 8, 2024

> Exposing the API provided in keras.utils.PyDataset to backends other than tensorflow would be a valuable addition.

PyDataset is usable with any backend. Nothing about it is TF-specific. Did you encounter an issue?

@LarsKue
Contributor Author

LarsKue commented May 8, 2024

Interesting, for me, the import does not work under the torch backend. I thought this was the intended behaviour:

conda create -y -n test-py-tensorflow python=3.11
conda activate test-py-tensorflow
pip install -U tensorflow keras
conda env config vars set KERAS_BACKEND=tensorflow
conda deactivate
conda activate test-py-tensorflow
python -c "from keras.utils import PyDataset"

works fine, whereas

conda create -y -n test-py-torch python=3.11
conda activate test-py-torch
pip install -U keras
conda install pytorch torchvision torchaudio cpuonly -c pytorch
conda env config vars set KERAS_BACKEND=torch
conda deactivate
conda activate test-py-torch
python -c "from keras.utils import PyDataset"

yields

ImportError: cannot import name 'PyDataset' from 'keras.utils'

Edit: Fixed the minimal example
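
For anyone reproducing this in both environments, here is a minimal sketch of a probe that reports whether a module exposes a given name, without stopping at the first exception. The helper name `probe_attribute` is hypothetical and not part of Keras; it relies only on `importlib` from the standard library.

```python
import importlib


def probe_attribute(module_name, attr_name):
    """Return (found, detail) for `attr_name` on `module_name`.

    Hypothetical helper for illustration; not part of Keras.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # package or missing dependency ends up here as well.
        return False, f"import of {module_name} failed: {exc}"
    if hasattr(module, attr_name):
        return True, f"{module_name}.{attr_name} is available"
    return False, f"{module_name} has no attribute {attr_name}"
```

Calling `probe_attribute("keras.utils", "PyDataset")` in each environment should distinguish a Keras build that lacks the class from an import that fails for some other reason.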

@fchollet
Member

fchollet commented May 8, 2024

The import pattern should be from keras.utils import PyDataset or import keras; keras.utils.PyDataset.

@LarsKue
Contributor Author

LarsKue commented May 8, 2024

Sorry, that was a mistake I put in when creating a minimal example.

The issue seems to be a little bit deeper. When I install keras using conda install keras, I get the above issue:

from keras.utils import PyDataset
>>> ImportError: cannot import name 'PyDataset' from 'keras.utils'

When I install with pip install -U keras (as recommended by keras), I instead now get:

import keras
>>> ModuleNotFoundError: No module named 'packaging'

@SuryanarayanaY
Collaborator

SuryanarayanaY commented May 9, 2024

Hi @LarsKue ,

Please install the packaging module using pip install packaging and try again.
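
To check for this kind of gap before importing Keras, a small sketch along these lines may help. It lists which top-level modules cannot be found in the current environment. The function name `missing_modules` and the module list in the usage section are assumptions for illustration, not Keras's actual requirements list.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    # For a top-level name, find_spec returns None when the module is not
    # installed, and it does not import the module.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Illustrative list only; not the full set of Keras dependencies.
    print(missing_modules(["packaging", "keras"]))
```

An empty list means every named module can be located; any name printed would need to be installed, for example with `pip install packaging`.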

@SuryanarayanaY
Collaborator

The packaging library is a dependency of Keras. Please check here.

packaging

@SuryanarayanaY SuryanarayanaY added type:support User is asking for help / asking an implementation question. Stackoverflow would be better suited. stat:awaiting response from contributor labels May 9, 2024
@LarsKue
Contributor Author

LarsKue commented May 10, 2024

@SuryanarayanaY thanks, this fixes the issue. packaging should probably be auto-installed when running pip install -U keras though. Would also be nice to have the conda repos updated so that the latest torch-compatible keras version is newer than 3.1.0, which has the issue of the missing PyDataset.
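
As a rough way to tell whether an environment has picked up an affected build, the sketch below compares the installed Keras version against 3.1.0, the version named above as missing PyDataset. The helper names are hypothetical. The version parsing is deliberately simplified: it reads numeric release segments only, and avoids `packaging.version` since that package may be the one that is missing.

```python
from importlib import metadata


def release_tuple(version):
    """Parse leading numeric segments, e.g. '3.1.0' -> (3, 1, 0).

    Simplified on purpose: pre-release and local suffixes are ignored.
    """
    parts = []
    for segment in version.split("."):
        digits = ""
        for ch in segment:
            if not "0" <= ch <= "9":
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_newer_than(installed, baseline="3.1.0"):
    """True if `installed` is strictly newer than `baseline` (taken from this thread)."""
    return release_tuple(installed) > release_tuple(baseline)


def installed_version(distribution):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("keras")
    if version is None:
        print("keras is not installed")
    else:
        print(version, "newer than 3.1.0:", is_newer_than(version))
```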

@LarsKue LarsKue closed this as completed May 10, 2024