
OpenCL support #22

Closed
outlace opened this issue Nov 9, 2015 · 542 comments
Labels
stat:contribution welcome Status - Contributions welcome

Comments

@outlace

outlace commented Nov 9, 2015

I understand TensorFlow only supports CUDA. What would need to be done to add in OpenCL support?

@nmabhinandan

It's strange that Google ditched open OpenCL for proprietary CUDA.
im-just-saying

@ebrevdo
Contributor

ebrevdo commented Nov 9, 2015

At the very least, the Eigen library would have to support OpenCL.

@bhack
Contributor

bhack commented Nov 9, 2015

👍

@keveman keveman added the cuda label Nov 9, 2015
@jamesliu96

👍

1 similar comment
@alexatknit

👍

@dhess

dhess commented Nov 11, 2015

thumbs up and all that.

@gujunli

gujunli commented Nov 11, 2015

I would be interested in expanding TensorFlow with OpenCL, as we have already released OpenCL Caffe: https://github.com/amd/OpenCL-caffe. Hopefully it can be integrated in a lightweight way. Is anyone interested in working together on this?

@bhack
Contributor

bhack commented Nov 11, 2015

@gujunli Nice to see AMD here. /cc @naibaf7 @lunochod

@nmabhinandan

That would be great.

@sasadep

sasadep commented Nov 11, 2015

👍

@bhack
Contributor

bhack commented Nov 15, 2015

/cc @lukeiwanski for Eigen/OpenCL/SYCL

@ankdesh

ankdesh commented Nov 16, 2015

@gujunli Certainly would be interested in contributing. Please let me know when you plan to start.

@lukeiwanski

Hi all,

Here at Codeplay we are looking into running Eigen's tensors on GPUs using SYCL (a modern C++ layer on top of OpenCL). From what we have gathered so far, the GPU tensor design is very closely coupled with CUDA, and it will require interface changes to support another programming model, particularly a SYCL/OpenCL 1.2 version.

If anyone is interested in digging deeper / helping out, we are most certainly interested in contributing.

Thanks,
Luke

@bhack
Contributor

bhack commented Nov 25, 2015

@lukeiwanski Thank you for the feedback. I think that @benoitsteiner worked on the tensor extension part of Eigen.

@jszuppe

jszuppe commented Dec 6, 2015

👍 I can help write some OpenCL/SYCL code if someone makes a plan and divides the work into tasks. I recommend using Boost.Compute as a wrapper for OpenCL (it makes running kernels, testing, and templating easier).

@ieee8023

ieee8023 commented Dec 7, 2015

+1

1 similar comment
@armish

armish commented Dec 7, 2015

👍

@lukeiwanski

Hi all,

Just to keep you posted, we are still investigating how we can change the Eigen interface to better fit the SYCL/OpenCL 1.2 programming model.
Once we come up with a reasonable approach that targets heterogeneous programming models (not only OpenCL/SYCL), we will create a proposal.

Thanks,
Luke

@gujunli

gujunli commented Dec 8, 2015

Please keep me updated. I developed OpenCL Caffe for AMD. I am also looking at TensorFlow.

Thanks.
Junlu

@bhack
Contributor

bhack commented Dec 9, 2015

/cc @ptillet @gongzg Is there any interest in this at Intel? I really hope that we don't fragment OpenCL here like in Caffe, where we have an AMD fork, unmerged Intel PRs, another semi-unofficial AMD PR, and a long-staging user PR (plus two old abandoned OpenCL efforts). Anyone interested in the history can take a look at the BVLC/caffe#2610 comments.

@martinwicke martinwicke added the stat:contribution welcome Status - Contributions welcome label Dec 16, 2015
@gongzg

gongzg commented Dec 17, 2015

@bhack We do have interest in this. Thanks for letting me know. If there is a proposal for Eigen's OpenCL/SYCL implementation, we will see what we can do from Intel side.

@benoitsteiner benoitsteiner self-assigned this Dec 23, 2015
@ZirconCode

👍

@bhack
Contributor

bhack commented Jan 1, 2016

An interesting initiative at https://github.com/ptillet/isaac, even though here we rely on the Eigen tensor extension.

@DanMcLaughlin

I would also like to contribute. @benoitsteiner, can you organize it?

@FelixSchwarz

FelixSchwarz commented Aug 27, 2018

I think this thread is mostly meaningless for developers (too much noise - and I'll add some more ;-), but I think many comments are missing the point:
If you want to run TensorFlow with AMD cards, OpenCL IS NOT what you are looking for - please head over to https://github.com/ROCmSoftwarePlatform/ and install the ROCm stack. AFAIK AMD's current strategy is based on ROCm instead of OpenCL for TensorFlow/PyTorch.

Generic OpenCL was too much maintenance and did not give enough performance benefits to be worthwhile for AMD. Therefore this ticket is only interesting if you are running (e.g.) an ARM platform which supports OpenCL only.

(Disclaimer: I'm just an outsider with no real insight into TensorFlow development, so maybe the information above is completely wrong and misleading. Feel free to bash me if you know better.)

@ghost

ghost commented Aug 27, 2018

Just a thought: what about LLVM with the new GPU offload? That would put a great level of abstraction between TensorFlow and CUDA-specific code.

@mirh

mirh commented Aug 27, 2018

What about all of you reading just ten posts above and noticing there is already a fork by lukeiwanski/codeplaysoftware that you can try?
(Also, my hat's off to Xiaomi for, for once, contributing a serious open-source effort.)

@fantesykikachu

@FelixSchwarz Just so you are aware, ROCm uses OpenCL; it is AMD's userspace OpenCL driver on Linux (which is why it doesn't support Windows). For those unfamiliar with how AMD's driver ecosystem on Linux works: there are the kernel-side drivers AMDGPU and AMDKFD (which is now being merged into AMDGPU), and then the userspace drivers RadeonSI (for OpenGL), RadV/AMDVLK (for Vulkan), and ROCm (for OpenCL).

@XVilka

XVilka commented Sep 15, 2018

Judging by the dynamics of this bug and the other forks, Google has zero interest in this and will never implement it in the official repository. I would vote for closing (or locking) this issue so as not to give anyone false hope.

@klokik

klokik commented Sep 15, 2018 via email

@tamusjroyce

tamusjroyce commented Nov 28, 2018

There is a TensorRT that supports the Movidius Pi Hat. And that Movidius Pi Hat is Google’s $45 “AIY Vision Kit”. Google links to Target to buy it.

This doesn't have any ties to CUDA or Nvidia? It says it uses an Intel chip. At its heart, maybe the chip is an FPGA? Does anyone know anything more about it?

@znmeb

znmeb commented Nov 28, 2018

I know quite a bit about the big Movidius unit - it's inference-only, and it runs pre-compiled TensorFlow or Caffe models. IIRC they're all in 16-bit mode.

The Movidius chip itself is much more powerful, but you have to be a qualified partner to get the SDK.

@talregev

talregev commented Dec 11, 2018

Here are some links for others trying to get TensorFlow with OpenCL:

https://github.com/hughperkins/tf-coriander
https://github.com/ChiahungTai/tensorflow-cl
https://github.com/guoyejun/tensorflow-cl
https://github.com/honggui/tensorflow-cl
https://github.com/benoitsteiner/tensorflow-opencl
https://github.com/lukeiwanski/tensorflow (repository is out of date)
https://github.com/codeplaysoftware/tensorflow
Also maybe worth checking:

https://documen.tician.de/pyopencl/
https://pypi.org/project/DeepCL/
https://www.khronos.org/sycl/

Feel free to add working projects.

@filips123

Is there any update? This issue is over 3 years old.

@mirh

mirh commented Jan 9, 2019

YES THERE IS JUST LOOK AT THE LAST HANDFUL OF POSTS.

@XVilka

XVilka commented Jan 10, 2019

@filips123 no, there are no updates, and there never will be in any foreseeable future - the probability of that is lower than that of an alien invasion or of finding a way to travel back in time.

@lppier

lppier commented Jan 10, 2019

This Intel initiative, PlaidML, works reasonably well and is worth checking out.
https://github.com/plaidml/plaidml
It runs on OpenCL or Metal on Mac. It works with MacBook Pro AMD GPUs, which is what I was looking for.
Meanwhile, could you guys help vote for PyTorch support in PlaidML? plaidml/plaidml#63
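For anyone trying PlaidML with Keras as discussed here: backend selection happens through an environment variable that must be set before Keras is imported. A minimal sketch, assuming the `plaidml-keras` package is installed (the actual import is left commented so the snippet runs without it):

```python
import os

# PlaidML provides its own Keras backend module; Keras picks the backend up
# from the KERAS_BACKEND environment variable, which must be set *before*
# keras is imported anywhere in the process.
os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"

# import keras  # uncomment with plaidml-keras installed; Keras then runs on PlaidML
print(os.environ["KERAS_BACKEND"])  # plaidml.keras.backend
```

With the backend selected, PlaidML's bundled `plaidml-setup` CLI is normally run once beforehand to pick the OpenCL or Metal device.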

@mirh

mirh commented Jan 10, 2019

PlaidML is certainly all nice and dandy (I, for one, somehow could get more performance out of an Nvidia GPU on OpenCL than with TF's CUDA itself)..
But it's a backend for Keras? A complete replacement for TensorFlow, which, you know, is the repo we are discussing this in?
(As far as I understand, the latest TF versions can export models directly to Keras, so there's that..)

Anyway, for the fourth damn time: if you want a recent solution on OpenCL, something that is still being actively developed (and also the thing with an actual chance to be merged here for real one day), there's just the Codeplay stack.
Again:
https://developer.codeplay.com/computecppce/latest/tensorflow-overview
https://github.com/Rbiessy/tensorflow/tree/dev/amd_gpu

@lppier

lppier commented Jan 11, 2019

PlaidML is certainly all nice and dandy (I, for one, somehow could get more performance out of an Nvidia GPU on OpenCL than with TF's CUDA itself)..
But it's a backend for Keras? A complete replacement for TensorFlow, which, you know, is the repo we are discussing this in?
(As far as I understand, the latest TF versions can export models directly to Keras, so there's that..)

Anyway, for the fourth damn time: if you want a recent solution on OpenCL, something that is still being actively developed (and also the thing with an actual chance to be merged here for real one day), there's just the Codeplay stack.
Again:
https://developer.codeplay.com/computecppce/latest/tensorflow-overview
https://github.com/Rbiessy/tensorflow/tree/dev/amd_gpu

My apologies, I had not realised there was no TensorFlow support. My assuming brain thought that Keras GPU support == TensorFlow support.

@iperov

iperov commented Feb 18, 2019

PlaidML is super cool. It works on Keras.
Of course I had to port some TF code to pure Keras in order to work on the PlaidML backend (for example tf.image.ssim).
But as a result, my code works on both NVIDIA and AMD cards.

Also, PlaidML is heaven for researchers. It automatically generates the gradient for any function you write in the "Tile" language, and it will run on your GPU at 80% of the speed of TensorFlow.

So I cannot understand why ML researchers are still using PyTorch. Let's boost ML science with Intel's PlaidML!

@Degerz

Degerz commented Feb 26, 2019

@iperov Care to know why practically no one uses PlaidML?

  1. It runs pitifully slowly on AMD's OpenCL implementations compared to TensorFlow's CUDA backend, so there goes at least half the reason to use it. The performance is so bad that using TensorFlow with CPUs is competitive with, or even outright beats, their hardware using PlaidML.

  2. Nobody is interested in maintaining their specialized Tile programming language, which only someone like a pure maths professor would concoct, so PlaidML's code quality just goes down the drain, and no serious programmer in their right mind would want to deal with overly clever code ...

  3. This pretty much ties into point 2, but ever since Intel bought out Vertex.AI, they don't care about PlaidML anymore. Intel's solution for GPU-accelerated machine learning is a new compiler specifically for deep learning, now known as nGraph, which targets TensorFlow, PyTorch, and other deep learning frameworks as a backend. There is no reason for them to keep developing PlaidML as their intermediary when they have nGraph ...

People use PyTorch for other reasons, such as maintainability and other features. To sum it up, PlaidML is Intel's tool, and they probably don't intend for it to play any role in the final parts of their plans. nGraph's current Intel GPU backend is based on OpenCL 2.1, of which only Intel has a conformant implementation, so Intel is only looking out for themselves rather than purely for the betterment of machine learning. As Intel goes on to further develop nGraph, I can't see them continue basing their GPU backend on OpenCL 2.1 alone, since many deep learning frameworks have templated kernels which are not compatible with the separate-source programming models of OpenCL, Metal, or Vulkan, so it's probably only for experimentation purposes. Intel's final GPU backend is probably going to be based either on SYCL 2.2 or on something else entirely, like OpenMP, and maybe they'll even bring a vendor-specific solution ...

As for AMD, who cares? OpenCL is irrelevant to them, and they're finally showing some results with their work on HIP ...

@talregev

@iperov Care to know why practically no one uses PlaidML?

  1. It runs pitifully slowly on AMD's OpenCL implementations compared to TensorFlow's CUDA backend, so there goes at least half the reason to use it. The performance is so bad that using TensorFlow with CPUs is competitive with, or even outright beats, their hardware using PlaidML.
  2. Nobody is interested in maintaining their specialized Tile programming language, which only someone like a pure maths professor would concoct, so PlaidML's code quality just goes down the drain, and no serious programmer in their right mind would want to deal with overly clever code ...
  3. This pretty much ties into point 2, but ever since Intel bought out Vertex.AI, they don't care about PlaidML anymore. Intel's solution for GPU-accelerated machine learning is a new compiler specifically for deep learning, now known as nGraph, which targets TensorFlow, PyTorch, and other deep learning frameworks as a backend. There is no reason for them to keep developing PlaidML as their intermediary when they have nGraph ...

People use PyTorch for other reasons, such as maintainability and other features. To sum it up, PlaidML is Intel's tool, and they probably don't intend for it to play any role in the final parts of their plans. nGraph's current Intel GPU backend is based on OpenCL 2.1, of which only Intel has a conformant implementation, so Intel is only looking out for themselves rather than purely for the betterment of machine learning. As Intel goes on to further develop nGraph, I can't see them continue basing their GPU backend on OpenCL 2.1 alone, since many deep learning frameworks have templated kernels which are not compatible with the separate-source programming models of OpenCL, Metal, or Vulkan, so it's probably only for experimentation purposes. Intel's final GPU backend is probably going to be based either on SYCL 2.2 or on something else entirely, like OpenMP, and maybe they'll even bring a vendor-specific solution ...

As for AMD, who cares? OpenCL is irrelevant to them, and they're finally showing some results with their work on HIP ...

What about all the GPUs inside ARM machines, like mobile phones, the Raspberry Pi, the ODROID, etc.?
Don't they support OpenCL?
Google should care about bringing TensorFlow to the GPU on Android.
The biggest neural-network training libraries run only on Nvidia GPUs, which just makes Nvidia GPUs more and more expensive (because people and companies buy them only for professional neural-network training), and Google will lose more money that way.

@iperov

iperov commented Feb 26, 2019

@Degerz what planet did you come from?
How can you compare TF on CPU with an AMD GPU?
An AMD GPU on PlaidML is 30x faster than TF on CPU.

  1. It runs pitifully slowly on AMD's OpenCL implementations compared to TensorFlow's CUDA backend, so there goes at least half the reason to use it

In my deepfakes tests OpenCL is slower by only 20%, and in some mini networks OpenCL is 20% FASTER.

My project DeepFaceLab has many users who have been waiting for AMD support. So many people were delighted when deepfakes could finally be trained on AMD cards.
Also, PlaidML is the only backend for Keras that supports AMD/Intel HD out of the box.
If a new AMD backend for Keras appears, of course my project will switch to it.
PyTorch has no future.

What is there to maintain in PlaidML? Ops are auto-differentiable; there is nothing to maintain.

Tile programming language which only someone like a pure maths professor would concoct

Machine learning was invented by professors of mathematics, wasn't it?
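The "auto differentiable" point above refers to reverse-mode automatic differentiation. As a rough illustration of the idea (a pure-Python sketch of the general technique, not PlaidML's actual Tile machinery), gradients can be accumulated by walking the expression graph backward with the chain rule:

```python
# Minimal reverse-mode automatic differentiation: each operation records its
# inputs and the local derivative with respect to each input, and backward()
# propagates seed gradients along every path, summing contributions.

class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # list of (parent_var, local_gradient) pairs
        self.grad = 0.0

    def __add__(self, other):
        # d(a+b)/da = 1, d(a+b)/db = 1
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        # d(a*b)/da = b, d(a*b)/db = a
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def backward(self, seed=1.0):
        # Accumulate the chain-rule product along this path, then recurse.
        self.grad += seed
        for parent, local_grad in self.parents:
            parent.backward(seed * local_grad)

# f(x, y) = x*y + x  =>  df/dx = y + 1, df/dy = x
x, y = Var(3.0), Var(4.0)
f = x * y + x
f.backward()
print(f.value, x.grad, y.grad)  # 15.0 5.0 3.0
```

Frameworks differ in how they record the graph (static vs. dynamic, recursion vs. topological sort), but this is the mechanism that lets a system derive gradients for any op built from differentiable primitives.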

@Degerz

Degerz commented Feb 26, 2019

@talregev What about ARM or Broadcom? The former probably has a subpar OpenCL implementation, and the latter doesn't even officially provide OpenCL drivers! It's not Google's responsibility to create and maintain a competent compute stack for hardware vendors ...

@iperov You realize that training neural nets with embedding layers on PlaidML is painful, right? PlaidML has a bunch of other limitations as well, such as not being all that well suited to DenseNets, the fact that its computation graphs are static, and does PlaidML even work well with RNNs?

As for your project, don't worry about it. You'll move on to something better like TensorFlow, since AMD will soon offer a native GPU backend for it once MIOpen gets upstreamed. MIOpen is their GPU-accelerated library of primitives for deep neural networks, similar to their competitor's cuDNN library, and both will leave PlaidML in the dust in terms of performance. Who cares about Intel iGPUs anyway? If Intel is truly committed to delivering high-performance deep learning on their future discrete graphics hardware, then they'll offer a single-source option just like the others (AMD/HIP and Nvidia/CUDA) did before them ...

PyTorch has no future.

Envy much? PyTorch is ~10x more popular than PlaidML, the newest techniques in DL are implemented easily on PyTorch, it has tons of different contributors, and it is actively developed by Facebook, all the while Intel hasn't contributed to PlaidML in nearly a month.

What is there to maintain in PlaidML? Ops are auto-differentiable; there is nothing to maintain.

So I take it from you that PlaidML shouldn't receive any fixes or new features going forward? If you don't see the value in improving code, then there's no point in convincing you to acknowledge PlaidML's glaring flaws ...

Machine learning was invented by professors of mathematics, wasn't it?

That doesn't mean we have to take up whatever programming language they make up, especially in the case of Tile, where elegance is clearly favoured over readability. It's no wonder so many potential contributors are scared away from contributing ...

@unoexperto

Jesus, I wish you guys would STFU and get back to work instead. I'll have to unsubscribe from this ticket because it's unbearable to get emails full of flame wars. Too bad the maintainers do not mute the thread.

@gunan @caisq @sanjoy Could you please do something about it?

@tensorflow tensorflow locked as too heated and limited conversation to collaborators Feb 26, 2019
@rthadur rthadur removed the cuda label Mar 19, 2019
@rthadur
Contributor

rthadur commented Jun 24, 2021

Closing the issue; please refer to https://blog.tensorflow.org/2020/08/faster-mobile-gpu-inference-with-opencl.html

@rthadur rthadur closed this as completed Jun 24, 2021