
Upgrade to CuDNN 7 and CUDA 9 #12052

Closed
tpankaj opened this issue Aug 4, 2017 · 170 comments
Labels
type:feature Feature requests

@tpankaj

tpankaj commented Aug 4, 2017

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows Server 2012
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 1.3.0-rc1
  • Python version: 3.5.2
  • Bazel version (if compiling from source): N/A
  • CUDA/cuDNN version: CUDA V8.0.44, CuDNN 6.0
  • GPU model and memory: Nvidia GeForce GTX 1080 Ti, 11 GB
  • Exact command to reproduce: N/A

Describe the problem

Please upgrade TensorFlow to support CUDA 9 and CuDNN 7. Nvidia claims this will provide a 2x performance boost on Pascal GPUs.

@shivaniag added the stat:awaiting tensorflower and type:feature labels Aug 4, 2017
@shivaniag
Contributor

@tfboyd do you have any comments on this?

@tfboyd
Member

tfboyd commented Aug 5, 2017

cuDNN 7 is still in preview mode and is being worked on. We just moved to cuDNN 6.0 with TensorFlow 1.3, which should go final in a couple of weeks; you can download TensorFlow 1.3.0rc2 if you are interested in that. I have not compiled with cuDNN 7 or CUDA 9 yet. I have heard CUDA 9 is not easy to install on all platforms and only select install packages are available. When the libraries are final we will start the final evaluation. NVIDIA has also just started sending patches to the major ML platforms to support aspects of these new libraries, and I suspect there will be additional work.

Edit: I meant to say CUDA 9 is not easy to install on all platforms and instead said cuDNN. I also changed "sure there will be work" to "I suspect there will be additional work." The rest of my silly statement I left as is, e.g. I did not realize cuDNN 7 went live yesterday.

@tfboyd self-assigned this Aug 5, 2017
@tfboyd removed the stat:awaiting tensorflower label Aug 5, 2017
@tfboyd
Member

tfboyd commented Aug 5, 2017

I am not saying how you should read the website, but the 2x-faster-on-Pascal claim looks to be part of the CUDA 8 release; I suppose it depends on how you read the site. NVIDIA has not mentioned to us that CUDA 9 is going to speed up Pascal by 2x (on everything), and while anything is possible, I would not expect that to happen.

https://developer.nvidia.com/cuda-toolkit/whatsnew

The site is a little confusing, but I think the section you are quoting is nested under CUDA 8. I only mention this so you do not have unrealistic expectations for their release. For Volta there should be some great gains from what I understand, and I think (I do not know for sure) people are just getting engineering samples of Volta to start high-level work to get ready for the full release.

@sclarkson
Contributor

@tfboyd cuDNN 7 is no longer in preview mode as of yesterday. It has been officially released for both CUDA 8.0 and CUDA 9.0 RC.

@tfboyd
Member

tfboyd commented Aug 5, 2017

Ahh I missed that. Thanks @sclarkson and sorry for the wrong info.

@theflofly
Contributor

I will certainly try it, because CUDA 9 finally supports gcc 6, and Ubuntu 17.04 ships with it.

@tfboyd
Member

tfboyd commented Aug 5, 2017 via email

@ppwwyyxx
Contributor

ppwwyyxx commented Aug 5, 2017

Speaking of features to be added, grouped convolution from cuDNN 7 would be an important feature for the vision community.
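For context, grouped convolution splits the input channels into independent groups and convolves each group separately. Until something like cuDNN 7's cudnnSetConvolutionGroupCount is wired up, a minimal sketch of the usual TF 1.x workaround (split + concat; the helper name and shapes below are hypothetical) looks like this:

import tensorflow as tf

def grouped_conv2d(x, filters, groups, kernel_size=3, name="grouped_conv"):
    # Hypothetical helper: emulate grouped convolution with split + concat.
    # x is NHWC; its channel count and `filters` must be divisible by `groups`.
    with tf.variable_scope(name):
        x_groups = tf.split(x, num_or_size_splits=groups, axis=3)
        y_groups = [
            tf.layers.conv2d(xg, filters // groups, kernel_size,
                             padding="same", name="group_%d" % i)
            for i, xg in enumerate(x_groups)
        ]
        # Concatenate per-group outputs back along the channel axis.
        return tf.concat(y_groups, axis=3)

# Example: 32 input channels, 64 output channels, 4 groups.
inp = tf.placeholder(tf.float32, [None, 224, 224, 32])
out = grouped_conv2d(inp, filters=64, groups=4)

Native cuDNN 7 support would replace the split/concat with a single call and should be faster.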

@tfboyd
Member

tfboyd commented Aug 5, 2017 via email

@tfboyd
Member

tfboyd commented Aug 5, 2017 via email

@4F2E4A2E
Contributor

4F2E4A2E commented Aug 6, 2017

I am trying to get cuDNN 7 with CUDA 8/9 running. CUDA 8 is not supported by the GTX 1080 Ti - at least the installer says so ^^

I am having a lot of trouble getting it all running together. I want to point out this great article, which sums up what I have already tried: https://nitishmutha.github.io/tensorflow/2017/01/22/TensorFlow-with-gpu-for-windows.html

The CUDA samples work via Visual Studio in both setup combinations.
Here is the output of deviceQuery.exe, which was compiled with Visual Studio:

PS C:\ProgramData\NVIDIA Corporation\CUDA Samples\v9.0\bin\win64\Release> deviceQuery.exe
C:\ProgramData\NVIDIA Corporation\CUDA Samples\v9.0\bin\win64\Release\deviceQuery.exe Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 1080 Ti"
  CUDA Driver Version / Runtime Version          9.0 / 9.0
  CUDA Capability Major/Minor version number:    6.1
  Total amount of global memory:                 11264 MBytes (11811160064 bytes)
  (28) Multiprocessors, (128) CUDA Cores/MP:     3584 CUDA Cores
  GPU Max Clock rate:                            1683 MHz (1.68 GHz)
  Memory Clock rate:                             5505 Mhz
  Memory Bus Width:                              352-bit
  L2 Cache Size:                                 2883584 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  CUDA Device Driver Mode (TCC or WDDM):         WDDM (Windows Display Driver Model)
  Device supports Unified Addressing (UVA):      Yes
  Supports Cooperative Kernel Launch:            No
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.0, CUDA Runtime Version = 9.0, NumDevs = 1, Device0 = GeForce GTX 1080 Ti
Result = PASS

@tfboyd do you have any link confirming the cuDNN update from Nvidia?

@tpankaj
Author

tpankaj commented Aug 7, 2017

@4F2E4A2E 1080 Ti definitely supports CUDA 8.0. That's what I've been using with TensorFlow for the past several months.

@colmantse

Hi all, so I have a GTX 1080 Ti with CUDA 8.0. I am trying to install tensorflow-gpu; do I go for cuDNN 5.1, 6.0, or 7.0?

@tfboyd
Member

tfboyd commented Aug 7, 2017 via email

@colmantse

Thanks. I tried cuDNN 6.0 but it doesn't work, I guess because of my dummy tf-gpu installation. cuDNN 5.1 works for me with Python 3.6.

@4F2E4A2E
Contributor

4F2E4A2E commented Aug 8, 2017

@tpankaj Thank you! I've got it running with CUDA 8 and cuDNN 5.1

@cancan101
Contributor

Here is the full set of features in cuDNN 7:

Key Features and Enhancements

This cuDNN release includes the following key features and enhancements.

* Tensor Cores: Version 7.0.1 of cuDNN is the first to support Tensor Core operations in its implementation. Tensor Cores provide highly optimized matrix multiplication building blocks that do not have an equivalent in the traditional instructions, so their numerical behavior is slightly different.
* cudnnSetConvolutionMathType, cudnnSetRNNMatrixMathType, and cudnnMathType_t: The cudnnSetConvolutionMathType and cudnnSetRNNMatrixMathType functions let you choose whether to use Tensor Core operations in the convolution and RNN layers, respectively, by setting the math mode to either CUDNN_TENSOR_OP_MATH or CUDNN_DEFAULT_MATH. Tensor Core operations perform parallel floating-point accumulation of multiple floating-point products. Setting the math mode to CUDNN_TENSOR_OP_MATH indicates that the library will use Tensor Core operations; the default, CUDNN_DEFAULT_MATH, indicates that the library will avoid them. Because the default mode is a serialized operation whereas Tensor Core operations are parallelized, the two might produce slightly different numerical results due to the different sequencing of operations. The library falls back to the default math mode when Tensor Core operations are not supported or not permitted.
* cudnnSetConvolutionGroupCount: A new interface that allows applications to perform grouped convolutions in the convolution layers in a single API call.
* cudnnCTCLoss: Provides a GPU implementation of the Connectionist Temporal Classification (CTC) loss function for RNNs. The CTC loss function is used for phoneme recognition in speech and handwriting recognition.
* CUDNN_BATCHNORM_SPATIAL_PERSISTENT: A new batch normalization mode for cudnnBatchNormalizationForwardTraining and cudnnBatchNormalizationBackward. This mode is similar to CUDNN_BATCHNORM_SPATIAL, but it can be faster for some tasks.
* cudnnQueryRuntimeError: Reports error codes written by GPU kernels when executing cudnnBatchNormalizationForwardTraining and cudnnBatchNormalizationBackward with the CUDNN_BATCHNORM_SPATIAL_PERSISTENT mode.
* cudnnGetConvolutionForwardAlgorithm_v7: This new API returns all algorithms sorted by expected performance (using internal heuristics). The algorithms are output in the same way as cudnnFindConvolutionForwardAlgorithm.
* cudnnGetConvolutionBackwardDataAlgorithm_v7: This new API returns all algorithms sorted by expected performance (using internal heuristics). The algorithms are output in the same way as cudnnFindConvolutionBackwardDataAlgorithm.
* cudnnGetConvolutionBackwardFilterAlgorithm_v7: This new API returns all algorithms sorted by expected performance (using internal heuristics). The algorithms are output in the same way as cudnnFindConvolutionBackwardFilterAlgorithm.
* CUDNN_REDUCE_TENSOR_MUL_NO_ZEROS: A multiplication reduction that ignores zeros in the data.
* CUDNN_OP_TENSOR_NOT: A unary operation that takes the negative of (alpha*A).
* cudnnGetDropoutDescriptor: Allows applications to retrieve dropout values.

@tfboyd
Member

tfboyd commented Aug 11, 2017

Alright, I am thinking about starting a new issue that is more of a "blog" of CUDA 9 RC + cuDNN 7.0. I have a TF build "in my hand" that is patched together but is CUDA 9 RC and cuDNN 7.0, and I want to see if anyone is interested in trying it. I also need to make sure there is not some weird reason why I cannot share it. There are changes that need to be made to some upstream libraries that TensorFlow uses, but you will start to see PRs coming in from NVIDIA in the near future. The team and I were able to test CUDA 8 + cuDNN 6 on Volta and then CUDA 9 RC + cuDNN 7 on Volta (V100) with FP32 code. I only do Linux builds and Python 2.7, but if any of you are interested I would like to try to involve the community more than we did with cuDNN 6.0. It might not be super fun, but I want to offer, and to make this feel more like we are in this together rather than me just feeding out information. I also still want to build out lists of what features we are working on but not promising for cuDNN 7 (and 6.0). @cancan101 thank you for the full list.

@Froskekongen

@tfboyd: I would be grateful for instructions on building with CUDA 9.0 RC + cuDNN 7.0. I am using a weird system myself (Ubuntu 17.10 beta with TF 1.3, CUDA 8.0, cuDNN 6.0, and gcc-4.8), and upgrading to CUDA 9 and cuDNN 7 would actually be nice compiler-wise.

@tfboyd
Member

tfboyd commented Aug 11, 2017 via email

@theflofly
Contributor

@tfboyd: I am interested, how will you share it? A branch?

@tanmayb123

@tfboyd I'd definitely be very interested as well. Thanks!

@tfboyd
Member

tfboyd commented Aug 14, 2017 via email

@4F2E4A2E
Contributor

The PR is declined but it seems to have been merged manually. Do we have to wait for an Eigen release, or is it built from the sources?

@hadaev8

hadaev8 commented Dec 15, 2017

Cool, then it will be in the nightly pip package?

@nasergh

nasergh commented Dec 16, 2017

@Tweakmind
I tried to rebuild TensorFlow using Python 2.7, but in the bazel build I get the error below. I also installed numpy, but no change.

bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0"
ERROR: /home/gh2/Downloads/tensorflow/util/python/BUILD:5:1: no such package '@local_config_python//': Traceback (most recent call last):
	File "/home/gh2/Downloads/tensorflow/third_party/py/python_configure.bzl", line 310
		_create_local_python_repository(repository_ctx)
	File "/home/gh2/Downloads/tensorflow/third_party/py/python_configure.bzl", line 274, in _create_local_python_repository
		_get_numpy_include(repository_ctx, python_bin)
	File "/home/gh2/Downloads/tensorflow/third_party/py/python_configure.bzl", line 257, in _get_numpy_include
		_execute(repository_ctx, [python_bin, "-c",..."], <2 more arguments>)
	File "/home/gh2/Downloads/tensorflow/third_party/py/python_configure.bzl", line 76, in _execute
		_python_configure_fail("\n".join([error_msg.strip() if ... ""]))
	File "/home/gh2/Downloads/tensorflow/third_party/py/python_configure.bzl", line 37, in _python_configure_fail
		fail(("%sPython Configuration Error:%...)))
Python Configuration Error: Problem getting numpy include path.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
**ImportError: No module named numpy**
Is numpy installed?
 and referenced by '//util/python:python_headers'
ERROR: Analysis of target '//tensorflow/tools/pip_package:build_pip_package' failed; build aborted: Loading failed
INFO: Elapsed time: 10.826s
FAILED: Build did NOT complete successfully (26 packages loaded)
    currently loading: tensorflow/core ... (3 packages)
    Fetching http://mirror.bazel.build/.../~ooura/fft.tgz; 20,338b 5s
    Fetching http://mirror.bazel.build/zlib.net/zlib-1.2.8.tar.gz; 19,924b 5s
    Fetching http://mirror.bazel.build/.../giflib-5.1.4.tar.gz; 18,883b 5s
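
(For reference: the failure above means numpy is not importable by the Python interpreter the Bazel build was configured with. A minimal check, assuming python2.7 is that interpreter, is to run roughly the same probe that third_party/py/python_configure.bzl performs:)

# Run this with the exact interpreter the build was configured for, e.g. python2.7.
# If this raises ImportError, install numpy for that interpreter first.
import numpy
print(numpy.get_include())  # the include path Bazel needs for '@local_config_python//'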

@masasys

masasys commented Dec 16, 2017

It seems that OS X is excluded from version 7.0.5 of cuDNN. Does anyone know the details?

@eeilon79

eeilon79 commented Dec 16, 2017

I still can't get tensorflow-gpu to work in Windows 10 (with CUDA 9.0.176 and cuDNN 7.0).
I've uninstalled both tensorflow and tensorflow-gpu and reinstalled them (with --no-cache-dir to ensure downloading of the most recent version with the Eigen workaround). When I install both, my GPU is not recognized:

InvalidArgumentError (see above for traceback): Cannot assign a device for operation 'random_uniform_1/sub': Operation was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0 ]. Make sure the device specification refers to a valid device.

When I install just tensorflow-gpu it complains about a missing dll:

ImportError: Could not find 'cudart64_80.dll'. TensorFlow requires that this DLL be installed in a directory that is named in your %PATH% environment variable. Download and install CUDA 8.0 from this URL: https://developer.nvidia.com/cuda-toolkit

Which is weird because my CUDA version is 9.0, not 8.0, and is recognized (deviceQuery test passed).
My python version is 3.6.3. I'm trying to run this code in Spyder (3.2.4) in order to test tensorflow-gpu.
What did I miss?
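
(A quick diagnostic sketch for the situation above, assuming a TF 1.4+ GPU wheel on Windows and that it is run from the same environment Spyder uses; cudart64_90.dll and cudnn64_7.dll are the CUDA 9.0 / cuDNN 7 runtime files a CUDA 9 build loads:)

import ctypes
import tensorflow as tf
from tensorflow.python.client import device_lib

print(tf.__version__)                   # a wheel asking for cudart64_80.dll was built against CUDA 8
ctypes.WinDLL("cudart64_90.dll")        # raises OSError if the CUDA 9.0 runtime is not on %PATH%
ctypes.WinDLL("cudnn64_7.dll")          # raises OSError if cuDNN 7 is not on %PATH%
print(device_lib.list_local_devices())  # should include a /device:GPU:0 entry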

@hadaev8

hadaev8 commented Dec 16, 2017

I'm trying to build from source with bazel on Win 7 and get this error:

No toolcahin for cpu 'x64_windows'

Can anyone build a whl?

@Tweakmind

@hadaev8, I need a lot more information to help. I can work on a whl, but it will have heavy dependencies and will not target Win7; once I solve macOS, I will solve Win10. In any case, post your details.

@eeilon79, I need to recreate this under Win10. I'm currently focused on macOS now that Ubuntu is solved. I will come back to Win10.

@Tweakmind

@nasergh, is there a requirement for python 2.7?

@Tweakmind

With CUDA 8.0 and cuDNN 6.0, this is how I installed TensorFlow from source with CUDA GPU and AVX2 support on Win10:

Requirements:

* Windows 10 64-Bit
* Visual Studio 15 C++ Tools
* NVIDIA CUDA® Toolkit 8.0
* NVIDIA cuDNN 6.0 for CUDA 8.0
* Cmake
* Swig

Install Visual Studio Community Edition Update 3 w/Windows Kit 10.0.10240.0
Follow instructions at: https://github.com/philferriere/dlwin (Thank you Phil)

Create a Virtual Drive N: for clarity
I suggest creating a directory off C: or your drive of choice and creating N: based on these instructions (2GB min):
https://technet.microsoft.com/en-us/library/gg318052(v=ws.10).aspx

Install Cuda 8.0 64-bit
https://developer.nvidia.com/cuda-downloads (Scroll down to Legacy)

Install cuDNN 6.0 for Cuda 8.0
https://developer.nvidia.com/rdp/cudnn-download
Put the cuda folder from the zip on N:\ and rename it cuDNN-6

Install CMake
https://cmake.org/files/v3.10/cmake-3.10.0-rc5-win64-x64.msi

Install Swig (swigwin-3.0.12)
https://sourceforge.net/projects/swig/files/swigwin/swigwin-3.0.12/swigwin-3.0.12.zip

cntk-py36

activate cntk-py36
pip install https://cntk.ai/PythonWheel/GPU/cntk-2.2-cp36-cp36m-win_amd64.whl
python -c "import cntk; print(cntk.__version__)"
conda install pygpu
pip install keras

Remove old tensorflow in Tools if it exists

move tensorflow tensorflow.not
git clone --recursive https://github.com/tensorflow/tensorflow.git
cd C:\Users\%USERNAME%\Tools\tensorflow\tensorflow\contrib\cmake
Edit CMakeLists.txt

Comment out these:

# if (tensorflow_OPTIMIZE_FOR_NATIVE_ARCH)
#   include(CheckCXXCompilerFlag)
#   CHECK_CXX_COMPILER_FLAG("-march=native" COMPILER_OPT_ARCH_NATIVE_SUPPORTED)
#   if (COMPILER_OPT_ARCH_NATIVE_SUPPORTED)
#     set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=native")
#   endif()
# endif()

Add these:

if (tensorflow_OPTIMIZE_FOR_NATIVE_ARCH)
  include(CheckCXXCompilerFlag)
  CHECK_CXX_COMPILER_FLAG("-march=native" COMPILER_OPT_ARCH_NATIVE_SUPPORTED)
  if (COMPILER_OPT_ARCH_NATIVE_SUPPORTED)
    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=native")
  else()
    CHECK_CXX_COMPILER_FLAG("/arch:AVX2" COMPILER_OPT_ARCH_AVX_SUPPORTED)
    if(COMPILER_OPT_ARCH_AVX_SUPPORTED)
      set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /arch:AVX2")
    endif()
  endif()
endif()

mkdir build & cd build

"C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\amd64\vcvars64.bat"

cmake .. -A x64 -DCMAKE_BUILD_TYPE=Release ^
-DSWIG_EXECUTABLE=N:/swigwin-3.0.12/swig.exe ^
-DPYTHON_EXECUTABLE=N:/Anaconda3/python.exe ^
-DPYTHON_LIBRARIES=N:/Anaconda3/libs/python36.lib ^
-Dtensorflow_ENABLE_GPU=ON ^
-DCUDNN_HOME="n:\cuDNN-6" ^
-Dtensorflow_WIN_CPU_SIMD_OPTIONS=/arch:AVX2

-- Building for: Visual Studio 14 2015
-- Selecting Windows SDK version 10.0.14393.0 to target Windows 10.0.16299.
-- The C compiler identification is MSVC 19.0.24225.1
-- The CXX compiler identification is MSVC 19.0.24225.1
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/x86_amd64/cl.exe
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/x86_amd64/cl.exe -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/x86_amd64/cl.exe
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/x86_amd64/cl.exe -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test COMPILER_OPT_ARCH_NATIVE_SUPPORTED
-- Performing Test COMPILER_OPT_ARCH_NATIVE_SUPPORTED - Failed
-- Performing Test COMPILER_OPT_ARCH_AVX_SUPPORTED
-- Performing Test COMPILER_OPT_ARCH_AVX_SUPPORTED - Success
-- Performing Test COMPILER_OPT_WIN_CPU_SIMD_SUPPORTED
-- Performing Test COMPILER_OPT_WIN_CPU_SIMD_SUPPORTED - Success
-- Found CUDA: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v8.0 (found suitable version "8.0", minimum required is "8.0")
-- Found PythonInterp: C:/Users/%USERNAME%/Anaconda3/python.exe (found version "3.6.3")
-- Found PythonLibs: C:/Users/%USERNAME%/Anaconda3/libs/python36.lib (found version "3.6.3")
-- Found SWIG: C:/Users/%USERNAME%/Tools/swigwin-3.0.12/swig.exe (found version "3.0.12")
-- Configuring done
-- Generating done
-- Build files have been written to: C:/Users/%USERNAME%/Tools/tensorflow/tensorflow/contrib/cmake/build

MSBuild /p:Configuration=Release tf_python_build_pip_package.vcxproj
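
Once the wheel produced by that MSBuild step is pip-installed (a sketch, assuming it went into the same Anaconda environment), a quick smoke test to confirm ops land on the GPU:

import tensorflow as tf

# log_device_placement prints which device each op runs on; look for GPU:0 lines.
with tf.device("/gpu:0"):
    a = tf.random_normal([1000, 1000])
    b = tf.random_normal([1000, 1000])
    c = tf.matmul(a, b)

with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    print(sess.run(tf.reduce_sum(c)))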

@hadaev8

hadaev8 commented Dec 16, 2017

@Tweakmind
Python 3.6, latest TensorFlow from master, CUDA 9.0, cuDNN 7.0.5 for CUDA 9.0; Bazel and SWIG downloaded today.

@argman

argman commented Dec 17, 2017

@Tweakmind do you build with master or ?

@hadaev8

hadaev8 commented Dec 17, 2017

@Tweakmind
Could you build on Windows with CUDA 9 and cuDNN 7 and share the .whl?

@alc5978

alc5978 commented Dec 21, 2017

@Tweakmind

Won't you try to build on Win 10 with CUDA 9 and cuDNN 7?

Thanks for your expertise!

@whatever1983

whatever1983 commented Dec 24, 2017

@hadaev8 @alc5978
pip install -U tf-nightly-gpu now gives a win10 build dated 2017122, which is based on the TF 1.5 beta with CUDA 9.0 and cuDNN 7.0.5. I ran it last night and it works fine. Now we should move on to CUDA 9.1 for the 12x faster CUDA kernel launches. TensorFlow's Windows support is pretty slow and anemic; stable official builds should be offered ASAP. I would actually like TensorFlow 1.5 stable to be released with CUDA 9.1, by the end of January please?
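
(A quick way to confirm the nightly wheel is the CUDA-enabled build, a sketch using TF 1.x test helpers:)

import tensorflow as tf

print(tf.__version__)                # should report a 1.5 nightly
print(tf.test.is_built_with_cuda())  # True for the GPU build
print(tf.test.gpu_device_name())     # e.g. '/device:GPU:0' when the GPU is visible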

@arunmandal53

arunmandal53 commented Jan 1, 2018

Go to http://www.python36.com/install-tensorflow141-gpu/ for step-by-step installation of TensorFlow with CUDA 9.1 and cuDNN 7.0.5 on Ubuntu, and go to http://www.python36.com/install-tensorflow-gpu-windows for step-by-step installation of TensorFlow with CUDA 9.1 and cuDNN 7.0.5 on Windows.

@tonmoyborah

It's 2018, almost the end of January, and installing TF with CUDA 9.1 and cuDNN 7 on Windows 10 is still not possible?

@tfboyd
Member

tfboyd commented Jan 24, 2018

1.5 is RC with CUDA 9 + cuDNN 7 and should go GA in the next few days. (CUDA 9.1 was GA in December and requires another device driver upgrade that is disruptive to many users. The current plan is to keep the default build on CUDA 9.0.x and keep upgrading to newer cuDNN versions).

I opened an issue to discuss CUDA 9.1.

The 12x kernel launch speed improvement is more nuanced than the headline number: the top end of 12x is for ops with a lot of arguments, and the disruption to users is high due to the device driver upgrade. I hope to have a "channel" testing 9.1 in the near future and figure out how to deal with this paradigm.

@ViktorM

ViktorM commented Jan 24, 2018

I hope it will finally be CUDA 9.1, not 9.0.

@Magicfeng007

I hope it will finally be CUDA 9.1, not 9.0 too.

@alc5978

alc5978 commented Feb 4, 2018

I'm sure it will finally be CUDA 9.1, not 9.0 too, won't it? :)

@tfboyd
Member

tfboyd commented Feb 5, 2018

@ViktorM @Magicfeng007 @alc5978
The 9.1 thread is here if you want to follow along, although it is basically closed. If you could list why you want 9.1 and what your setup/configuration is, that would be useful. A benchmark you ran showing the perf boost would also help us understand the immediate need. In meetings with NVIDIA, we both agreed there was not an immediate need to make 9.1 the default, which would force people to upgrade their drivers again.

@meghashyam0046

If anybody is still facing problems like Keras with the TensorFlow backend not using the GPU, just follow the instructions on this page. It is up to date and works correctly.
https://research.wmz.ninja/articles/2017/01/configuring-gpu-accelerated-keras-in-windows-10.html
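
After following that guide, a small check that Keras is actually using the GPU (a sketch, assuming Keras 2 with the TensorFlow backend):

import numpy as np
import tensorflow as tf
from keras import backend as K
from keras.models import Sequential
from keras.layers import Dense

# Give Keras a session that logs device placement; GPU:0 lines confirm GPU use.
K.set_session(tf.Session(config=tf.ConfigProto(log_device_placement=True)))

model = Sequential([Dense(64, activation="relu", input_shape=(100,)), Dense(1)])
model.compile(optimizer="sgd", loss="mse")
model.fit(np.random.rand(32, 100), np.random.rand(32, 1), epochs=1, verbose=0)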

@alc5978

alc5978 commented Feb 24, 2018

Hi all,
Today I installed tensorflow-gpu 1.6.0rc1 on Win10 with CUDA 9.0 and the cuDNN 7.0.5 library, following http://www.python36.com/install-tensorflow-using-official-pip-pacakage/

Everything seems OK.

@ashokpant

I created a script for the NVIDIA GPU prerequisites (CUDA 9.0 and cuDNN 7.0) for the latest TensorFlow (v1.5+); here is the link.
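
(A rough cross-check of those prerequisites from Python, a sketch assuming typical Ubuntu install paths, which may differ on your machine:)

import subprocess

# CUDA toolkit version: expect "release 9.0" in the nvcc banner.
print(subprocess.check_output(["nvcc", "--version"]).decode())

# cuDNN version: the major version is defined in cudnn.h (path may vary,
# e.g. /usr/include/cudnn.h or /usr/local/cuda/include/cudnn.h).
with open("/usr/include/cudnn.h") as f:
    print([line.strip() for line in f if line.startswith("#define CUDNN_MAJOR")])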
