GPU Not Utilized in TensorFlow Federated Training on Google Colab #4042

Open
ghaffarzadeh opened this issue Jul 9, 2023 · 1 comment
Labels
bug Something isn't working

Comments

@ghaffarzadeh

Describe the bug
I am using TensorFlow Federated (TFF) to train a model with federated stochastic gradient descent (FedSGD). When I run my code on Google Colab, the GPU is not utilized during training, even though TensorFlow detects the GPU. I would like to understand how to make TensorFlow Federated use the GPU on Google Colab.

Environment:

  • OS Platform and Distribution: Google Colab
  • Python package versions: TensorFlow Federated 0.60.0, TensorFlow 2.12.0
  • Python version: 3.10

Steps to reproduce:

  1. Set up a Google Colab environment with the specified versions of TensorFlow Federated and TensorFlow.
  2. Create an iterative process using tff.learning.algorithms.build_fed_sgd.
  3. Initialize the process with iterative_process.initialize().
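
The report does not include code, so the steps above can only be sketched under assumptions. In the sketch below, the linear model, the 784-feature input shape, the synthetic client data, and the function names `build_fed_sgd_process` and `run_one_round` are all hypothetical; it also assumes that `tff.learning.models.from_keras_model` is available in the installed TFF release. The extra call to `next` goes one step beyond the listed steps, since the missing GPU activity was observed during training rather than at initialization.

```python
import collections


def build_fed_sgd_process(num_features=784, num_classes=10):
    """Builds a FedSGD learning process for a small, hypothetical linear model."""
    # Imported here so the heavy dependencies load only when the process is built.
    import tensorflow as tf
    import tensorflow_federated as tff

    # Assumed batch structure: a dict with features "x" and integer labels "y".
    input_spec = collections.OrderedDict(
        x=tf.TensorSpec(shape=[None, num_features], dtype=tf.float32),
        y=tf.TensorSpec(shape=[None, 1], dtype=tf.int32),
    )

    def model_fn():
        # TFF requires the Keras model to be constructed inside model_fn.
        keras_model = tf.keras.Sequential([
            tf.keras.layers.InputLayer(input_shape=(num_features,)),
            tf.keras.layers.Dense(num_classes),
        ])
        return tff.learning.models.from_keras_model(
            keras_model,
            loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
            input_spec=input_spec,
        )

    # Step 2 of the report.
    return tff.learning.algorithms.build_fed_sgd(model_fn)


def run_one_round(num_clients=2, num_features=784, num_classes=10):
    """Initializes the process and runs one round on synthetic client data."""
    import tensorflow as tf

    iterative_process = build_fed_sgd_process(num_features, num_classes)
    state = iterative_process.initialize()  # Step 3 of the report.

    # Synthetic stand-in for real client datasets (not from the report).
    client_datasets = [
        tf.data.Dataset.from_tensor_slices(
            collections.OrderedDict(
                x=tf.random.uniform([32, num_features]),
                y=tf.random.uniform([32, 1], maxval=num_classes, dtype=tf.int32),
            )
        ).batch(8)
        for _ in range(num_clients)
    ]
    result = iterative_process.next(state, client_datasets)
    return result.metrics
```

Calling `run_one_round()` in a Colab cell while watching GPU activity should show whether the runtime places any work on the GPU.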

Expected behavior
Previously, when I used TFF version 0.20.0 with Python version 3.9 in Google Colab, I observed that the GPU usage would increase during the learning process, resulting in faster model training. However, after upgrading the Python version to 3.10, I had to switch to TFF version 0.60.0 since TFF 0.20.0 is no longer compatible with Google Colab. In TFF 0.60.0, I noticed that the GPU usage does not increase when the learning process starts, leading to significantly longer training times for my models.
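
To put numbers on the difference in GPU usage described above, one option is to sample `nvidia-smi` while a training round runs. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH (as it normally is on a Colab GPU runtime); the helper names `parse_gpu_stats` and `sample_gpu_stats` are hypothetical.

```python
import subprocess

# Ask nvidia-smi for utilization (%) and used memory (MiB), one CSV row per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_stats(output):
    """Parses rows such as '37, 1024' into a list of per-GPU dicts."""
    stats = []
    for line in output.strip().splitlines():
        utilization, memory_used = (field.strip() for field in line.split(","))
        # Note: int() raises ValueError if a field is reported as unsupported.
        stats.append({
            "utilization_pct": int(utilization),
            "memory_used_mib": int(memory_used),
        })
    return stats


def sample_gpu_stats():
    """Runs nvidia-smi once and returns the parsed per-GPU statistics."""
    completed = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_stats(completed.stdout)
```

Calling `sample_gpu_stats()` before and during a call to the process's `next` method gives a rough comparison; utilization that stays near zero during training would be consistent with the behavior reported here.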

Additional context
I have already verified that the GPU is recognized by TensorFlow, and it was functioning properly with previous versions of TensorFlow Federated. The issue seems to be specific to TFF 0.60.0 in combination with Python 3.10 on Google Colab. Any insights or guidance on how to enable GPU utilization in this setup would be greatly appreciated.

@ghaffarzadeh added the bug label on Jul 9, 2023
@zcharles8
Collaborator

Hi @ghaffarzadeh. Apologies, I'm only looking at this now. We recently released a new version of TFF. Can you check whether the new version uses the GPU?

Unfortunately, TFF has its own embedded TF runtime, so the fact that the GPU is recognized by TF does not necessarily imply it'll be recognized by TFF. We might need to look into this more deeply.
