Limiting threads in TensorFlow #84

Open
rth opened this issue Jun 24, 2021 · 7 comments

Comments


rth commented Jun 24, 2021

As far as I can tell, limiting the number of threads in TensorFlow with threadpoolctl currently doesn't work.

For instance, take the following minimal example with TensorFlow 2.5.0 (example.py):

import tensorflow as tf
import numpy as np

from threadpoolctl import threadpool_limits

with threadpool_limits(limits=1):
    X = tf.constant(np.arange(0, 5000**2, dtype=np.int32), shape=(5000, 5000))
    
    tf.matmul(X, X)

running

time python example.py

on a 64-core CPU produces

real    0m3.781s
user    1m8.685s

so the user (CPU) time is still >> the real (wall-clock) time, meaning that many CPU cores are being used.

This becomes an issue when people run scikit-learn's GridSearchCV or cross_validate on a Keras or TensorFlow model, since it then results in CPU over-subscription. I'm surprised there aren't more issues about it at scikit-learn.

Regrettably, TensorFlow doesn't recognize any environment variable to limit the number of CPU cores it uses either. The only workaround I found is to set the CPU affinity mask with taskset. But that wouldn't help for cross-validation, for instance, since joblib would then need to set the affinity mask when creating new processes, which is currently not supported.
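For reference, a minimal sketch of that affinity workaround done directly from Python (Linux-only; os.sched_setaffinity is the programmatic equivalent of taskset, and pinning to core 0 is just an arbitrary choice for illustration):

import os

# Pin this process (pid 0 means "the current process") to a single core
# before TensorFlow is imported; the threads it later spawns inherit
# this affinity mask.
os.sched_setaffinity(0, {0})

import numpy as np
import tensorflow as tf

X = tf.constant(np.arange(0, 5000**2, dtype=np.int32), shape=(5000, 5000))
tf.matmul(X, X)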

Has anyone looked into this in the past by any chance?


rth commented Jun 24, 2021

By contrast, the same example with PyTorch 1.9 works as expected:

import torch

from threadpoolctl import threadpool_limits

with threadpool_limits(limits=1):
    X = torch.randn(10000, 10000)
    torch.matmul(X, X)
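A quick way to check that the limit was really applied to the pools threadpoolctl knows about is to print threadpool_info() inside the context; this only reflects the libraries threadpoolctl actually detects (here PyTorch's OpenMP runtime), so it is a diagnostic rather than a guarantee:

import torch  # imported only so PyTorch's native threading runtime gets loaded
from threadpoolctl import threadpool_info, threadpool_limits

with threadpool_limits(limits=1):
    # every detected pool should now report num_threads == 1
    for pool in threadpool_info():
        print(pool["prefix"], pool["user_api"], pool["num_threads"])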

jeremiedbb (Collaborator) commented

TF relies on MKL for this operation, right?
If they have their own tool to detect the number of CPU cores and explicitly set it as the number of threads for MKL, it will take priority over threadpoolctl and there's nothing we can do about it. It would also explain why the env var doesn't work. But I really don't know what they actually do in TF, so maybe there's another reason.
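One way to see which native runtimes the installed TensorFlow actually loads, without knowing the build details, is to scan the process's memory maps after the import. This is just a Linux-only diagnostic sketch, not something threadpoolctl provides:

import tensorflow as tf  # imported only so its native libraries get loaded

# Look for well-known math / threading runtimes among the shared objects
# mapped into this process (MKL, oneDNN, Intel/GNU OpenMP, OpenBLAS).
with open("/proc/self/maps") as f:
    mapped = f.read().lower()

for marker in ("libmkl", "dnnl", "libiomp", "libgomp", "openblas"):
    print(marker, marker in mapped)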


rth commented Jul 9, 2021

They have multiple build options (https://www.tensorflow.org/install/source#optimizations). One is to use https://github.com/oneapi-src/oneDNN, which is part of MKL and also seems to support various threading runtimes (https://github.com/oneapi-src/oneDNN#linux).
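If the installed wheel really is one of those oneDNN/MKL builds (e.g. intel-tensorflow), its threading layer is typically OpenMP, so setting OMP_NUM_THREADS before the import should at least be honored. A hedged experiment; a stock wheel that uses its own internal threadpool will likely ignore it:

import os

# Must be set before TensorFlow (and its OpenMP runtime, if any) is imported;
# only effective for builds whose threading layer is OpenMP-based.
os.environ["OMP_NUM_THREADS"] = "1"

import numpy as np
import tensorflow as tf

X = tf.constant(np.arange(0, 5000**2, dtype=np.int32), shape=(5000, 5000))
tf.matmul(X, X)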


ogrisel commented Aug 3, 2021

What is the output of the following command?

python -m threadpoolctl --import tensorflow


rth commented Aug 3, 2021

The output is,

[
  {
    "filepath": <...>/lib/python3.8/site-packages/numpy.libs/libopenblasp-r0-09e95953.3.13.so",
    "prefix": "libopenblas",
    "user_api": "blas",
    "internal_api": "openblas",
    "version": "0.3.13",
    "num_threads": 24,
    "threading_layer": "pthreads"
  },
  {
    "filepath": "<...>/lib/python3.8/site-packages/scipy.libs/libopenblasp-r0-085ca80a.3.9.so",
    "prefix": "libopenblas",
    "user_api": "blas",
    "internal_api": "openblas",
    "version": "0.3.9",
    "num_threads": 24,
    "threading_layer": "pthreads"
  }
]

so I imagine it's additionally using some other threading system that's not being detected.


ogrisel commented Aug 3, 2021

This is the OpenBLAS used by NumPy and SciPy. TensorFlow is probably using a linear algebra library of its own (e.g. Eigen?) whose threading layer is not handled by threadpoolctl.
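If that's the case, the only remaining handle seems to be TensorFlow's own configuration API, which sizes its inter-/intra-op pools. A sketch of that workaround (it has to run before the first op executes, and it is TF-specific rather than something threadpoolctl can hook into):

import numpy as np
import tensorflow as tf

# Configure TensorFlow's own threadpools before any op runs; afterwards
# the pool sizes are frozen for the lifetime of the process.
tf.config.threading.set_intra_op_parallelism_threads(1)
tf.config.threading.set_inter_op_parallelism_threads(1)

X = tf.constant(np.arange(0, 5000**2, dtype=np.int32), shape=(5000, 5000))
tf.matmul(X, X)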


ogrisel commented Aug 3, 2021

Also, threadpoolctl cannot detect statically linked libraries, only dynamically linked libraries.
