
Upgrade to CuDNN 7 and CUDA 9 #12052

Closed
tpankaj opened this issue Aug 4, 2017 · 170 comments
Labels
type:feature Feature requests

@tpankaj

tpankaj commented Aug 4, 2017

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows Server 2012
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 1.3.0-rc1
  • Python version: 3.5.2
  • Bazel version (if compiling from source): N/A
  • CUDA/cuDNN version: CUDA V8.0.44, CuDNN 6.0
  • GPU model and memory: Nvidia GeForce GTX 1080 Ti, 11 GB
  • Exact command to reproduce: N/A

Describe the problem

Please upgrade TensorFlow to support CUDA 9 and CuDNN 7. Nvidia claims this will provide a 2x performance boost on Pascal GPUs.

@shivaniag added the stat:awaiting tensorflower and type:feature labels Aug 4, 2017
@shivaniag
Contributor

@tfboyd do you have any comments on this?

@tfboyd
Member

tfboyd commented Aug 5, 2017

cuDNN 7 is still in preview mode and is being worked on. We just moved to cuDNN 6.0 with TensorFlow 1.3, which should go final in a couple of weeks; you can download TensorFlow 1.3.0rc2 if you are interested in that. I have not compiled with cuDNN 7 or CUDA 9 yet. I have heard CUDA 9 is not easy to install on all platforms and only select install packages are available. When the libraries are final we will start the final evaluation. NVIDIA has also just started sending patches to the major ML platforms to support aspects of these new libraries, and I suspect there will be additional work.

Edit: I meant to say CUDA 9 is not easy to install on all platforms and instead said cuDNN. I also changed "sure there will be work" to "I suspect there will be additional work." The rest of my silly statement I left as is, e.g. I did not realize cuDNN 7 went live yesterday.

@tfboyd self-assigned this Aug 5, 2017
@tfboyd removed the stat:awaiting tensorflower label Aug 5, 2017
@tfboyd
Member

tfboyd commented Aug 5, 2017

I am not saying how you should read the website, but the 2x-faster-on-Pascal claim looks to be part of the CUDA 8 release; I suppose it depends on how you read the site. NVIDIA has not mentioned to us that CUDA 9 is going to speed up Pascal by 2x (on everything), and while anything is possible, I would not expect that to happen.

https://developer.nvidia.com/cuda-toolkit/whatsnew

The site is a little confusing, but I think the section you are quoting is nested under CUDA 8. I only mention this so you do not have unrealistic expectations for their release. For Volta there should be some great gains from what I understand, and I think (I do not know for sure) people are just getting engineering samples of Volta to start high-level work to get ready for the full release.

@sclarkson
Contributor

@tfboyd cuDNN 7 is no longer in preview mode as of yesterday. It has been officially released for both CUDA 8.0 and CUDA 9.0 RC.

@tfboyd
Member

tfboyd commented Aug 5, 2017

Ahh I missed that. Thanks @sclarkson and sorry for the wrong info.

@theflofly
Contributor

I will certainly try it, because CUDA 9 finally supports gcc 6, and Ubuntu 17.04 ships with it.

@tfboyd
Member

tfboyd commented Aug 5, 2017 via email

@ppwwyyxx
Contributor

ppwwyyxx commented Aug 5, 2017

Speaking of features to be added, grouped convolution from cuDNN 7 would be an important feature for the vision community.
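For context, grouped convolution splits the input channels into independent groups and convolves each group separately. Until something like cuDNN 7's cudnnSetConvolutionGroupCount is wired up, a minimal sketch of the usual TF 1.x workaround (split + concat; the helper name and shapes below are hypothetical) looks like this:

import tensorflow as tf

def grouped_conv2d(x, filters, groups, kernel_size=3, name="grouped_conv"):
    # Hypothetical helper: emulate grouped convolution with split + concat.
    # x is NHWC; its channel count and `filters` must be divisible by `groups`.
    with tf.variable_scope(name):
        x_groups = tf.split(x, num_or_size_splits=groups, axis=3)
        y_groups = [
            tf.layers.conv2d(xg, filters // groups, kernel_size,
                             padding="same", name="group_%d" % i)
            for i, xg in enumerate(x_groups)
        ]
        # Concatenate per-group outputs back along the channel axis.
        return tf.concat(y_groups, axis=3)

# Example: 32 input channels, 64 output channels, 4 groups.
inp = tf.placeholder(tf.float32, [None, 224, 224, 32])
out = grouped_conv2d(inp, filters=64, groups=4)

Native cuDNN 7 support would replace the split/concat with a single call and should be faster.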

@tfboyd
Member

tfboyd commented Aug 5, 2017 via email

@tfboyd
Member

tfboyd commented Aug 5, 2017 via email

@4F2E4A2E
Contributor

4F2E4A2E commented Aug 6, 2017

I am trying to get cuDNN 7 with CUDA 8/9 running. CUDA 8 is not supported by the GTX 1080 Ti - at least the installer says so ^^

I am having a lot of trouble getting it all running together. I want to point out this great article, which sums up what I have already tried: https://nitishmutha.github.io/tensorflow/2017/01/22/TensorFlow-with-gpu-for-windows.html

The CUDA samples work via Visual Studio in both setup combinations.
Here is the output of deviceQuery.exe, which was compiled with Visual Studio:

PS C:\ProgramData\NVIDIA Corporation\CUDA Samples\v9.0\bin\win64\Release> deviceQuery.exe
C:\ProgramData\NVIDIA Corporation\CUDA Samples\v9.0\bin\win64\Release\deviceQuery.exe Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 1080 Ti"
  CUDA Driver Version / Runtime Version          9.0 / 9.0
  CUDA Capability Major/Minor version number:    6.1
  Total amount of global memory:                 11264 MBytes (11811160064 bytes)
  (28) Multiprocessors, (128) CUDA Cores/MP:     3584 CUDA Cores
  GPU Max Clock rate:                            1683 MHz (1.68 GHz)
  Memory Clock rate:                             5505 Mhz
  Memory Bus Width:                              352-bit
  L2 Cache Size:                                 2883584 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  CUDA Device Driver Mode (TCC or WDDM):         WDDM (Windows Display Driver Model)
  Device supports Unified Addressing (UVA):      Yes
  Supports Cooperative Kernel Launch:            No
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.0, CUDA Runtime Version = 9.0, NumDevs = 1, Device0 = GeForce GTX 1080 Ti
Result = PASS

@tfboyd do you have any link confirming the cuDNN update from Nvidia?

@tpankaj
Author

tpankaj commented Aug 7, 2017

@4F2E4A2E 1080 Ti definitely supports CUDA 8.0. That's what I've been using with TensorFlow for the past several months.

@colmantse

Hi all, so I have a GTX 1080 Ti with CUDA 8.0. I am trying to install tensorflow-gpu; do I go for cuDNN 5.1, 6.0, or 7.0?

@tfboyd
Member

tfboyd commented Aug 7, 2017 via email

@colmantse

Thanks. I tried cuDNN 6.0 but it doesn't work, I guess because of my dummy tf-gpu installation. cuDNN 5.1 works for me with Python 3.6.

@4F2E4A2E
Contributor

4F2E4A2E commented Aug 8, 2017

@tpankaj Thank you! I've got it running with CUDA 8 and cuDNN 5.1

@cancan101
Contributor

Here is the full set of features in cuDNN 7:

Key Features and Enhancements

This cuDNN release includes the following key features and enhancements.

* Tensor Cores: Version 7.0.1 of cuDNN is the first to support Tensor Core operations in its implementation. Tensor Cores provide highly optimized matrix multiplication building blocks that do not have an equivalent in the traditional instructions, so their numerical behavior is slightly different.
* cudnnSetConvolutionMathType, cudnnSetRNNMatrixMathType, and cudnnMathType_t: The cudnnSetConvolutionMathType and cudnnSetRNNMatrixMathType functions let you choose whether to use Tensor Core operations in the convolution and RNN layers, respectively, by setting the math mode to either CUDNN_TENSOR_OP_MATH or CUDNN_DEFAULT_MATH. Tensor Core operations perform parallel floating-point accumulation of multiple floating-point products. Setting the math mode to CUDNN_TENSOR_OP_MATH indicates that the library will use Tensor Core operations; the default, CUDNN_DEFAULT_MATH, indicates that the library will avoid them. Because the default mode is a serialized operation whereas Tensor Core operations are parallelized, the two might produce slightly different numerical results due to the different sequencing of operations. The library falls back to the default math mode when Tensor Core operations are not supported or not permitted.
* cudnnSetConvolutionGroupCount: A new interface that allows applications to perform grouped convolutions in the convolution layers in a single API call.
* cudnnCTCLoss: Provides a GPU implementation of the Connectionist Temporal Classification (CTC) loss function for RNNs. The CTC loss function is used for phoneme recognition in speech and handwriting recognition.
* CUDNN_BATCHNORM_SPATIAL_PERSISTENT: A new batch normalization mode for cudnnBatchNormalizationForwardTraining and cudnnBatchNormalizationBackward. This mode is similar to CUDNN_BATCHNORM_SPATIAL, but it can be faster for some tasks.
* cudnnQueryRuntimeError: Reports error codes written by GPU kernels when executing cudnnBatchNormalizationForwardTraining and cudnnBatchNormalizationBackward with the CUDNN_BATCHNORM_SPATIAL_PERSISTENT mode.
* cudnnGetConvolutionForwardAlgorithm_v7: This new API returns all algorithms sorted by expected performance (using internal heuristics). The algorithms are output in the same way as cudnnFindConvolutionForwardAlgorithm.
* cudnnGetConvolutionBackwardDataAlgorithm_v7: This new API returns all algorithms sorted by expected performance (using internal heuristics). The algorithms are output in the same way as cudnnFindConvolutionBackwardDataAlgorithm.
* cudnnGetConvolutionBackwardFilterAlgorithm_v7: This new API returns all algorithms sorted by expected performance (using internal heuristics). The algorithms are output in the same way as cudnnFindConvolutionBackwardFilterAlgorithm.
* CUDNN_REDUCE_TENSOR_MUL_NO_ZEROS: A multiplication reduction that ignores zeros in the data.
* CUDNN_OP_TENSOR_NOT: A unary operation that takes the negative of (alpha*A).
* cudnnGetDropoutDescriptor: Allows applications to retrieve dropout values.

@tfboyd
Member

tfboyd commented Aug 11, 2017

Alright, I am thinking about starting a new issue that is more of a "blog" of CUDA 9 RC + cuDNN 7.0. I have a TF build "in my hand" that is patched together but is CUDA 9 RC and cuDNN 7.0, and I want to see if anyone is interested in trying it. I also need to make sure there is not some weird reason why I cannot share it. There are changes that need to be made to some upstream libraries that TensorFlow uses, but you will start to see PRs coming in from NVIDIA in the near future. The team and I were able to test CUDA 8 + cuDNN 6 on Volta and then CUDA 9 RC + cuDNN 7 on Volta (V100) with FP32 code. I only do Linux builds and Python 2.7, but if any of you are interested I would like to try to involve the community more than we did with cuDNN 6.0. It might not be super fun, but I want to offer, and to make this feel more like we are in this together rather than me just feeding out information. I also still want to build out lists of what features we are working on but not promising for cuDNN 7 (and 6.0). @cancan101 thank you for the full list.

@Froskekongen

@tfboyd: I would be grateful for instructions on building with CUDA 9.0 RC + cuDNN 7.0. I am using a weird system myself (Ubuntu 17.10 beta with TF 1.3, CUDA 8.0, cuDNN 6.0, and gcc-4.8), and upgrading to CUDA 9 and cuDNN 7 would actually be nice compiler-wise.

@tfboyd
Member

tfboyd commented Aug 11, 2017 via email

@theflofly
Contributor

@tfboyd: I am interested, how will you share it? A branch?

@tanmayb123

@tfboyd I'd definitely be very interested as well. Thanks!

@tfboyd
Member

tfboyd commented Aug 14, 2017 via email

@4F2E4A2E
Contributor

The PR is declined but it seems to have been merged manually. Do we have to wait for an Eigen release, or is it built from the sources?

@hadaev8

hadaev8 commented Dec 15, 2017

Cool, then it will be in the nightly pip package?

@nasergh

nasergh commented Dec 16, 2017

@Tweakmind
I tried to rebuild TensorFlow using Python 2.7, but in the bazel build I get the error below. I also installed numpy, but no change.

bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0"
ERROR: /home/gh2/Downloads/tensorflow/util/python/BUILD:5:1: no such package '@local_config_python//': Traceback (most recent call last):
	File "/home/gh2/Downloads/tensorflow/third_party/py/python_configure.bzl", line 310
		_create_local_python_repository(repository_ctx)
	File "/home/gh2/Downloads/tensorflow/third_party/py/python_configure.bzl", line 274, in _create_local_python_repository
		_get_numpy_include(repository_ctx, python_bin)
	File "/home/gh2/Downloads/tensorflow/third_party/py/python_configure.bzl", line 257, in _get_numpy_include
		_execute(repository_ctx, [python_bin, "-c",..."], <2 more arguments>)
	File "/home/gh2/Downloads/tensorflow/third_party/py/python_configure.bzl", line 76, in _execute
		_python_configure_fail("\n".join([error_msg.strip() if ... ""]))
	File "/home/gh2/Downloads/tensorflow/third_party/py/python_configure.bzl", line 37, in _python_configure_fail
		fail(("%sPython Configuration Error:%...)))
Python Configuration Error: Problem getting numpy include path.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
**ImportError: No module named numpy**
Is numpy installed?
 and referenced by '//util/python:python_headers'
ERROR: Analysis of target '//tensorflow/tools/pip_package:build_pip_package' failed; build aborted: Loading failed
INFO: Elapsed time: 10.826s
FAILED: Build did NOT complete successfully (26 packages loaded)
    currently loading: tensorflow/core ... (3 packages)
    Fetching http://mirror.bazel.build/.../~ooura/fft.tgz; 20,338b 5s
    Fetching http://mirror.bazel.build/zlib.net/zlib-1.2.8.tar.gz; 19,924b 5s
    Fetching http://mirror.bazel.build/.../giflib-5.1.4.tar.gz; 18,883b 5s
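
(For reference: the failure above means numpy is not importable by the Python interpreter the Bazel build was configured with. A minimal check, assuming python2.7 is that interpreter, is to run roughly the same probe that third_party/py/python_configure.bzl performs:)

# Run this with the exact interpreter the build was configured for, e.g. python2.7.
# If this raises ImportError, install numpy for that interpreter first.
import numpy
print(numpy.get_include())  # the include path Bazel needs for '@local_config_python//'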

@masasys

masasys commented Dec 16, 2017

It seems that OS X is excluded from version 7.0.5 of cuDNN. Does anyone know the details?

@eeilon79

eeilon79 commented Dec 16, 2017

I still can't get tensorflow-gpu to work in Windows 10 (with CUDA 9.0.176 and cuDNN 7.0).
I've uninstalled both tensorflow and tensorflow-gpu and reinstalled them (with --no-cache-dir to ensure downloading of the most recent version with the Eigen workaround). When I install both, my GPU is not recognized:

InvalidArgumentError (see above for traceback): Cannot assign a device for operation 'random_uniform_1/sub': Operation was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0 ]. Make sure the device specification refers to a valid device.

When I install just tensorflow-gpu it complains about a missing dll:

ImportError: Could not find 'cudart64_80.dll'. TensorFlow requires that this DLL be installed in a directory that is named in your %PATH% environment variable. Download and install CUDA 8.0 from this URL: https://developer.nvidia.com/cuda-toolkit

Which is weird because my CUDA version is 9.0, not 8.0, and is recognized (deviceQuery test passed).
My python version is 3.6.3. I'm trying to run this code in Spyder (3.2.4) in order to test tensorflow-gpu.
What did I miss?
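
(A quick diagnostic sketch for the situation above, assuming a TF 1.4+ GPU wheel on Windows and that it is run from the same environment Spyder uses; cudart64_90.dll and cudnn64_7.dll are the CUDA 9.0 / cuDNN 7 runtime files a CUDA 9 build loads:)

import ctypes
import tensorflow as tf
from tensorflow.python.client import device_lib

print(tf.__version__)                   # a wheel asking for cudart64_80.dll was built against CUDA 8
ctypes.WinDLL("cudart64_90.dll")        # raises OSError if the CUDA 9.0 runtime is not on %PATH%
ctypes.WinDLL("cudnn64_7.dll")          # raises OSError if cuDNN 7 is not on %PATH%
print(device_lib.list_local_devices())  # should include a /device:GPU:0 entry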

@hadaev8

hadaev8 commented Dec 16, 2017

I'm trying to build from source with bazel on Win 7 and get this error:

No toolcahin for cpu 'x64_windows'

Can anyone build a whl?

@Tweakmind

@hadaev8, I need a lot more information to help. I can work on a whl, but it will have heavy dependencies and will not target Win7; once I solve macOS, I will solve Win10. In any case, post your details.

@eeilon79, I need to recreate this under Win10. I'm currently focused on macOS now that Ubuntu is solved. I will come back to Win10.

@Tweakmind

@nasergh, is there a requirement for python 2.7?

@Tweakmind

With CUDA 8.0 and cuDNN 6.0, this is how I installed TensorFlow from source with CUDA GPU and AVX2 support on Win10:

Requirements:

* Windows 10 64-Bit
* Visual Studio 15 C++ Tools
* NVIDIA CUDA® Toolkit 8.0
* NVIDIA cuDNN 6.0 for CUDA 8.0
* Cmake
* Swig

Install Visual Studio Community Edition Update 3 w/Windows Kit 10.0.10240.0
Follow instructions at: https://github.com/philferriere/dlwin (Thank you Phil)

Create a Virtual Drive N: for clarity
I suggest creating a directory off C: or your drive of choice and creating N: based on these instructions (2GB min):
https://technet.microsoft.com/en-us/library/gg318052(v=ws.10).aspx

Install Cuda 8.0 64-bit
https://developer.nvidia.com/cuda-downloads (Scroll down to Legacy)

Install cuDNN 6.0 for Cuda 8.0
https://developer.nvidia.com/rdp/cudnn-download
Put the cuda folder from the zip on N:\ and rename it cuDNN-6

Install CMake
https://cmake.org/files/v3.10/cmake-3.10.0-rc5-win64-x64.msi

Install Swig (swigwin-3.0.12)
https://sourceforge.net/projects/swig/files/swigwin/swigwin-3.0.12/swigwin-3.0.12.zip

cntk-py36

activate cntk-py36
pip install https://cntk.ai/PythonWheel/GPU/cntk-2.2-cp36-cp36m-win_amd64.whl
python -c "import cntk; print(cntk.__version__)"
conda install pygpu
pip install keras

Remove old tensorflow in Tools if it exists

move tensorflow tensorflow.not
git clone --recursive https://github.com/tensorflow/tensorflow.git
cd C:\Users\%USERNAME%\Tools\tensorflow\tensorflow\contrib\cmake
Edit CMakeLists.txt

Comment out these:

# if (tensorflow_OPTIMIZE_FOR_NATIVE_ARCH)
#   include(CheckCXXCompilerFlag)
#   CHECK_CXX_COMPILER_FLAG("-march=native" COMPILER_OPT_ARCH_NATIVE_SUPPORTED)
#   if (COMPILER_OPT_ARCH_NATIVE_SUPPORTED)
#     set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=native")
#   endif()
# endif()

Add these:

if (tensorflow_OPTIMIZE_FOR_NATIVE_ARCH)
  include(CheckCXXCompilerFlag)
  CHECK_CXX_COMPILER_FLAG("-march=native" COMPILER_OPT_ARCH_NATIVE_SUPPORTED)
  if (COMPILER_OPT_ARCH_NATIVE_SUPPORTED)
    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=native")
  else()
    CHECK_CXX_COMPILER_FLAG("/arch:AVX2" COMPILER_OPT_ARCH_AVX_SUPPORTED)
    if(COMPILER_OPT_ARCH_AVX_SUPPORTED)
      set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /arch:AVX2")
    endif()
  endif()
endif()

mkdir build & cd build

"C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\amd64\vcvars64.bat"

cmake .. -A x64 -DCMAKE_BUILD_TYPE=Release ^
-DSWIG_EXECUTABLE=N:/swigwin-3.0.12/swig.exe ^
-DPYTHON_EXECUTABLE=N:/Anaconda3/python.exe ^
-DPYTHON_LIBRARIES=N:/Anaconda3/libs/python36.lib ^
-Dtensorflow_ENABLE_GPU=ON ^
-DCUDNN_HOME="n:\cuDNN-6" ^
-Dtensorflow_WIN_CPU_SIMD_OPTIONS=/arch:AVX2

-- Building for: Visual Studio 14 2015
-- Selecting Windows SDK version 10.0.14393.0 to target Windows 10.0.16299.
-- The C compiler identification is MSVC 19.0.24225.1
-- The CXX compiler identification is MSVC 19.0.24225.1
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/x86_amd64/cl.exe
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/x86_amd64/cl.exe -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/x86_amd64/cl.exe
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/x86_amd64/cl.exe -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test COMPILER_OPT_ARCH_NATIVE_SUPPORTED
-- Performing Test COMPILER_OPT_ARCH_NATIVE_SUPPORTED - Failed
-- Performing Test COMPILER_OPT_ARCH_AVX_SUPPORTED
-- Performing Test COMPILER_OPT_ARCH_AVX_SUPPORTED - Success
-- Performing Test COMPILER_OPT_WIN_CPU_SIMD_SUPPORTED
-- Performing Test COMPILER_OPT_WIN_CPU_SIMD_SUPPORTED - Success
-- Found CUDA: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v8.0 (found suitable version "8.0", minimum required is "8.0")
-- Found PythonInterp: C:/Users/%USERNAME%/Anaconda3/python.exe (found version "3.6.3")
-- Found PythonLibs: C:/Users/%USERNAME%/Anaconda3/libs/python36.lib (found version "3.6.3")
-- Found SWIG: C:/Users/%USERNAME%/Tools/swigwin-3.0.12/swig.exe (found version "3.0.12")
-- Configuring done
-- Generating done
-- Build files have been written to: C:/Users/%USERNAME%/Tools/tensorflow/tensorflow/contrib/cmake/build

MSBuild /p:Configuration=Release tf_python_build_pip_package.vcxproj
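
Once the wheel produced by that MSBuild step is pip-installed (a sketch, assuming it went into the same Anaconda environment), a quick smoke test to confirm ops land on the GPU:

import tensorflow as tf

# log_device_placement prints which device each op runs on; look for GPU:0 lines.
with tf.device("/gpu:0"):
    a = tf.random_normal([1000, 1000])
    b = tf.random_normal([1000, 1000])
    c = tf.matmul(a, b)

with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    print(sess.run(tf.reduce_sum(c)))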

@hadaev8

hadaev8 commented Dec 16, 2017

@Tweakmind
Python 3.6, latest TensorFlow from master, CUDA 9.0, cuDNN 7.0.5 for CUDA 9.0; Bazel and SWIG downloaded today.

@argman

argman commented Dec 17, 2017

@Tweakmind do you build with master or ?

@hadaev8

hadaev8 commented Dec 17, 2017

@Tweakmind
Could you build on Windows with CUDA 9 and cuDNN 7 and share the .whl?

@alc5978

alc5978 commented Dec 21, 2017

@Tweakmind

Won't you try to build on Win 10 with CUDA 9 and cuDNN 7?

Thanks for your expertise!

@whatever1983

whatever1983 commented Dec 24, 2017

@hadaev8 @alc5978
pip install -U tf-nightly-gpu now gives a win10 build dated 2017122, which is based on the TF 1.5 beta with CUDA 9.0 and cuDNN 7.0.5. I ran it last night and it works fine. Now we should move on to CUDA 9.1 for the 12x faster CUDA kernel launches. TensorFlow's Windows support is pretty slow and anemic; stable official builds should be offered ASAP. I would actually like TensorFlow 1.5 stable to be released with CUDA 9.1, by the end of January please?
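
(A quick way to confirm the nightly wheel is the CUDA-enabled build, a sketch using TF 1.x test helpers:)

import tensorflow as tf

print(tf.__version__)                # should report a 1.5 nightly
print(tf.test.is_built_with_cuda())  # True for the GPU build
print(tf.test.gpu_device_name())     # e.g. '/device:GPU:0' when the GPU is visible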

@arunmandal53

arunmandal53 commented Jan 1, 2018

Go to http://www.python36.com/install-tensorflow141-gpu/ for step-by-step installation of TensorFlow with CUDA 9.1 and cuDNN 7.0.5 on Ubuntu, and go to http://www.python36.com/install-tensorflow-gpu-windows for step-by-step installation of TensorFlow with CUDA 9.1 and cuDNN 7.0.5 on Windows.

@tonmoyborah

It's 2018, almost the end of January, and installing TF with CUDA 9.1 and cuDNN 7 on Windows 10 is still not possible?

@tfboyd
Member

tfboyd commented Jan 24, 2018

1.5 is RC with CUDA 9 + cuDNN 7 and should go GA in the next few days. (CUDA 9.1 was GA in December and requires another device driver upgrade that is disruptive to many users. The current plan is to keep the default build on CUDA 9.0.x and keep upgrading to newer cuDNN versions).

I opened an issue to discuss CUDA 9.1.

The 12x kernel launch speed improvement is more nuanced than the headline number: the top end of 12x is for ops with a lot of arguments, and the disruption to users is high due to the device driver upgrade. I hope to have a "channel" testing 9.1 in the near future and figure out how to deal with this paradigm.

@ViktorM

ViktorM commented Jan 24, 2018

I hope it will finally be CUDA 9.1, not 9.0.

@Magicfeng007

I hope it will finally be CUDA 9.1, not 9.0 too.

@alc5978

alc5978 commented Feb 4, 2018

I'm sure it will finally be CUDA 9.1, not 9.0 too, won't it? :)

@tfboyd
Member

tfboyd commented Feb 5, 2018

@ViktorM @Magicfeng007 @alc5978
The 9.1 thread is here if you want to follow along, although it is basically closed. If you could list why you want 9.1 and what your setup/configuration is, that would be useful. A benchmark you ran showing the perf boost would also help us understand the immediate need. In meetings with NVIDIA, we both agreed there was not an immediate need to make 9.1 the default, which would force people to upgrade their drivers again.

@meghashyam0046

If anybody is still facing problems like Keras with the TensorFlow backend not using the GPU, just follow the instructions on this page. It is up to date and works correctly.
https://research.wmz.ninja/articles/2017/01/configuring-gpu-accelerated-keras-in-windows-10.html
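
After following that guide, a small check that Keras is actually using the GPU (a sketch, assuming Keras 2 with the TensorFlow backend):

import numpy as np
import tensorflow as tf
from keras import backend as K
from keras.models import Sequential
from keras.layers import Dense

# Give Keras a session that logs device placement; GPU:0 lines confirm GPU use.
K.set_session(tf.Session(config=tf.ConfigProto(log_device_placement=True)))

model = Sequential([Dense(64, activation="relu", input_shape=(100,)), Dense(1)])
model.compile(optimizer="sgd", loss="mse")
model.fit(np.random.rand(32, 100), np.random.rand(32, 1), epochs=1, verbose=0)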

@alc5978

alc5978 commented Feb 24, 2018

Hi all,
Today I installed tensorflow-gpu 1.6.0rc1 on Win10 with CUDA 9.0 and the cuDNN 7.0.5 library, following http://www.python36.com/install-tensorflow-using-official-pip-pacakage/

Everything seems OK.

@ashokpant

I created a script for the NVIDIA GPU prerequisites (CUDA 9.0 and cuDNN 7.0) for the latest TensorFlow (v1.5+); here is the link.
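
(A rough cross-check of those prerequisites from Python, a sketch assuming typical Ubuntu install paths, which may differ on your machine:)

import subprocess

# CUDA toolkit version: expect "release 9.0" in the nvcc banner.
print(subprocess.check_output(["nvcc", "--version"]).decode())

# cuDNN version: the major version is defined in cudnn.h (path may vary,
# e.g. /usr/include/cudnn.h or /usr/local/cuda/include/cudnn.h).
with open("/usr/include/cudnn.h") as f:
    print([line.strip() for line in f if line.startswith("#define CUDNN_MAJOR")])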
