Deep Learning Server Setup: Ubuntu Server 16.04 + CUDA 8.0 + cuDNN 7 + TensorFlow 1.3

Hardware

Host: Dell T620 tower server

GPU: Nvidia Tesla K20c

System & driver stack

Ubuntu Server 16.04

Nvidia driver: 375.26

CUDA 8.0

cuDNN 7.0.3

TensorFlow 1.3

1 Installing Ubuntu Server

OS: Ubuntu Server 16.04

Installation method: optical disc (installing from a USB stick runs into the ISO failing to mount)

Steps to burn the installation disc:

  1. Open the ISO file with the Windows Disc Image Burner
  2. Click Burn

1.1 Network

After installation the DHCP client may not be running, so you may need to start it by hand with the dhclient command.
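A minimal sketch of doing that by hand, assuming the interface is the eno2 port that shows up in the ifconfig output below:

# request a DHCP lease for the eno2 interface (pick the interface that is actually cabled)
sudo dhclient eno2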

The ifconfig command shows the interface details:

zjw@t620:~$ ifconfig
eno1 Link encap:Ethernet HWaddr f0:1f:af:e8:79:0e
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
Memory:dad00000-dadfffff

eno2 Link encap:Ethernet HWaddr f0:1f:af:e8:79:0f
inet addr:219.223.196.65 Bcast:219.223.199.255 Mask:255.255.248.0
inet6 addr: 2001:250:3c02:200:760:ae61:106:5168/128 Scope:Global
inet6 addr: fe80::306f:6961:652b:a64b/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:16373 errors:0 dropped:0 overruns:0 frame:0
TX packets:6720 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:15745390 (15.7 MB) TX bytes:971604 (971.6 KB)
Memory:dae00000-daefffff

idrac Link encap:Ethernet HWaddr f0:1f:af:e8:79:11
inet addr:169.254.0.2 Bcast:169.254.0.255 Mask:255.255.255.0
inet6 addr: fe80::b5f1:8170:e317:6c31/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:2 errors:0 dropped:0 overruns:0 frame:0
TX packets:33 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:594 (594.0 B) TX bytes:4752 (4.7 KB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:232 errors:0 dropped:0 overruns:0 frame:0
TX packets:232 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:25105 (25.1 KB) TX bytes:25105 (25.1 KB)

This server's address is the one assigned to the eno2 interface.

On the campus network you normally have to enter a username and password through a browser login page before getting Internet access; the following command does the same thing from the terminal:

curl -X POST -F 'action=login' -F 'username=<account>' -F 'password=<password>' -F 'ac_id=1' -F 'ajax=1' http://10.0.10.66/include/auth_action.php

Before running it, make sure 10.0.10.66 can be pinged. After logging in, test whether the server is actually online:
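A quick reachability check beforehand, as a sketch:

# the campus authentication server must be reachable for the login POST to work
ping -c 3 10.0.10.66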

zjw@t620:~$ curl www.baidu.com
<!DOCTYPE html>
<!--STATUS OK--><html> <head><meta http-equiv=content-type content=text/html;charset=utf-8><meta http-equiv=X-UA-Compatible content=IE=Edge><meta content=always name=referrer><link rel=stylesheet type=text/css href=http://s1.bdstatic.com/r/www/cache/bdorz/baidu.min.css><title>百度一下,你就知道</title></head> <body link=#0000cc> <div id=wrapper> <div id=head> <div class=head_wrapper> <div class=s_form> <div class=s_form_wrapper> <div id=lg> <img hidefocus=true src=//www.baidu.com/img/bd_logo1.png width=270 height=129> </div> <form id=form name=f action=//www.baidu.com/s class=fm> <input type=hidden name=bdorz_come value=1> <input type=hidden name=ie value=utf-8> <input type=hidden name=f value=8> <input type=hidden name=rsv_bp value=1> <input type=hidden name=rsv_idx value=1> <input type=hidden name=tn value=baidu><span class="bg s_ipt_wr"><input id=kw name=wd class=s_ipt value maxlength=255 autocomplete=off autofocus></span><span class="bg s_btn_wr"><input type=submit id=su value=百度一下 class="bg s_btn"></span> </form> </div> </div> <div id=u1> <a href=http://news.baidu.com name=tj_trnews class=mnav>新闻</a> <a href=http://www.hao123.com name=tj_trhao123 class=mnav>hao123</a> <a href=http://map.baidu.com name=tj_trmap class=mnav>地图</a> <a href=http://v.baidu.com name=tj_trvideo class=mnav>视频</a> <a href=http://tieba.baidu.com name=tj_trtieba class=mnav>贴吧</a> <noscript> <a href=http://www.baidu.com/bdorz/login.gif?login&amp;tpl=mn&amp;u=http%3A%2F%2Fwww.baidu.com%2f%3fbdorz_come%3d1 name=tj_login class=lb>登录</a> </noscript> <script>document.write('<a href="http://www.baidu.com/bdorz/login.gif?login&tpl=mn&u='+ encodeURIComponent(window.location.href+ (window.location.search === "" ? "?" : "&")+ "bdorz_come=1")+ '" name="tj_login" class="lb">登录</a>');</script> <a href=//www.baidu.com/more/ name=tj_briicon class=bri style="display: block;">更多产品</a> </div> </div> </div> <div id=ftCon> <div id=ftConw> <p id=lh> <a href=http://home.baidu.com>关于百度</a> <a href=http://ir.baidu.com>About Baidu</a> </p> <p id=cp>&copy;2017&nbsp;Baidu&nbsp;<a href=http://www.baidu.com/duty/>使用百度前必读</a>&nbsp; <a href=http://jianyi.baidu.com/ class=cp-feedback>意见反馈</a>&nbsp;京ICP证030173号&nbsp; <img src=//www.baidu.com/img/gs.gif> </p> </div> </div> </div> </body> </html>

The page content comes back, so the server is online.

1.2 Switching the package mirror to Tsinghua (TUNA)

Ubuntu's package sources are configured in /etc/apt/sources.list. Make a backup of the stock file, then replace its contents with the lines below to use the TUNA mirror.

sudo vim /etc/apt/sources.list

Edit it as follows:

# Source-package (deb-src) entries are commented out by default to speed up apt update; uncomment them if needed
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-updates main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-updates main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-backports main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-backports main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-security main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-security main restricted universe multiverse

# Pre-release (proposed) sources; enabling them is not recommended
# deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-proposed main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-proposed main restricted universe multiverse

After editing, run sudo apt-get update so the change takes effect.
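A minimal sketch of the backup mentioned above (do the copy before overwriting the file) together with the refresh:

# keep a copy of the stock sources.list, then refresh the package index from the new mirror
sudo cp /etc/apt/sources.list /etc/apt/sources.list.bak
sudo apt-get update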

1.3 Remote login

Install OpenSSH: sudo apt-get install openssh-server
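Once sshd is running you can work on the server from another machine; a sketch using the eno2 address from the ifconfig output above (substitute your own user and address):

# log in to the server over SSH from a client machine
ssh zjw@219.223.196.65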

==================================================

Check GPU status with nvidia-smi:

zjw@t620:~$ nvidia-smi
Sat Sep 22 15:22:55 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.130 Driver Version: 384.130 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla K20c Off | 00000000:02:00.0 Off | 0 |
| 30% 33C P0 52W / 225W | 0MiB / 4742MiB | 98% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+

Driver download: https://www.nvidia.cn/Download/index.aspx?lang=cn

On Ubuntu 16.04 the K20c only works with CUDA 8.0 or newer.


Search result: (screenshot omitted)

2 Pre-Installation Actions

2.1. Verify You Have a CUDA-Capable GPU

zjw@t620:~$ lspci | grep -i nvidia
02:00.0 3D controller: NVIDIA Corporation GK110GL [Tesla K20c] (rev a1)

2.2. Verify You Have a Supported Version of Linux

zjw@t620:~$  uname -m && cat /etc/*release
x86_64
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.3 LTS"
NAME="Ubuntu"
VERSION="16.04.3 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.3 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial

2.3. Verify the System Has gcc Installed

zjw@t620:~$ gcc --version
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

2.4. Verify the System has the Correct Kernel Headers and Development Packages Installed

zjw@t620:~$ uname -r
4.4.0-87-generic

Install the matching kernel headers and development packages:

$ sudo apt-get install linux-headers-$(uname -r)

2.5. Choose an Installation Method

runfile (recommended) / deb

2.6. Download the NVIDIA CUDA Toolkit

CUDA Toolkit 8.0 download: https://developer.nvidia.com/cuda-80-ga2-download-archive

CUDA Toolkit 8.0 installation guide (following it step by step rarely goes wrong): https://docs.nvidia.com/cuda/archive/8.0/

3 Installing CUDA 8.0

Remove any existing CUDA packages first:

sudo apt-get remove cuda
sudo apt-get autoclean
sudo apt-get --purge remove nvidia* # remove the NVIDIA driver packages

Then change into /usr/local/: cd /usr/local/

sudo rm -r cuda-*

3.1 Installing from the runfile

The runfile method is recommended (the deb method is a pain to uninstall later).

Runfile download: https://developer.nvidia.com/cuda-80-ga2-download-archive

It can be fetched with wget:

wget https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda_8.0.61_375.26_linux-run

sudo sh cuda_8.0.61_375.26_linux-run

The installer asks a long series of questions: answer no to installing the OpenGL libraries, and yes or the default to everything else. If a newer driver is already installed, do not install the bundled driver.

-------------------------------------------------------------
Do you accept the previously read EULA?
accept/decline/quit: accept

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 375.26?
(y)es/(n)o/(q)uit: n

Install the CUDA 8.0 Toolkit?
(y)es/(n)o/(q)uit: y

Enter Toolkit Location
[ default is /usr/local/cuda-8.0 ]:

Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y

Install the CUDA 8.0 Samples?
(y)es/(n)o/(q)uit: y

Enter CUDA Samples Location
[ default is /home/cmfchina ]:

Installing the CUDA Toolkit in /usr/local/cuda-8.0 ...
Missing recommended library: libGLU.so
Missing recommended library: libX11.so
Missing recommended library: libXi.so
Missing recommended library: libXmu.so

Installing the CUDA Samples in /home/cmfchina ...
Copying samples to /home/cmfchina/NVIDIA_CUDA-8.0_Samples now...
Finished copying samples.

===========
= Summary =
===========

Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-8.0
Samples: Installed in /home/cmfchina, but missing recommended libraries

Please make sure that
- PATH includes /usr/local/cuda-8.0/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-8.0/lib64, or, add /usr/local/cuda-8.0/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-8.0/bin

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-8.0/doc/pdf for detailed information on setting up CUDA.

***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 361.00 is required for CUDA 8.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
sudo <CudaInstaller>.run -silent -driver

3.2 Setting environment variables

Open the shell startup file for editing:

vim ~/.bashrc

Append the following three lines at the end of the file (press "i" in vim to enter insert mode):

export PATH=/usr/local/cuda-8.0/bin:$PATH  
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda

Save and quit, then run the following so the new variables take effect immediately (source is a shell builtin, so it does not need sudo):

source ~/.bashrc
sudo ldconfig

Reboot once the installation is complete.
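After the reboot, a quick sanity check that the new variables are active (a sketch):

# the cuda-8.0 bin directory should appear in PATH and nvcc should resolve inside it
echo $PATH | tr ':' '\n' | grep cuda
which nvcc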

3.3 Checking the CUDA setup

root@t620:/home/zjw# nvidia-smi
Sun Sep 23 17:31:45 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.26 Driver Version: 375.26 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla K20c Off | 0000:02:00.0 Off | 0 |
| 30% 32C P0 52W / 225W | 0MiB / 4742MiB | 98% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+

Check that the CUDA toolkit itself is configured correctly:

zjw@t620:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61

3.4 Testing the CUDA samples

Go into the CUDA samples directory and run make to build all the demos. Note: this make builds every demo under the samples folder, so it takes a while; if you only want to test a single example, cd into its subdirectory and build just that one.

# change to the CUDA samples directory
cd /usr/local/cuda-8.0/samples   # or: cd /home/NVIDIA_CUDA-8.0_Samples

# if make is missing, install the build tools first: sudo apt-get install build-essential
# -j lets make use all CPU cores to speed up the build
make -j

# after the build, change to the release directory (full path: /usr/local/cuda-8.0/samples/bin/x86_64/linux/release)
cd ./bin/x86_64/linux/release

# run an example to verify the installation
./deviceQuery

# read through the output: it lists your NVIDIA GPU's properties, and Result = PASS at the end means success.

After the build finishes, change into the bin directory:

root@t620:/home/zjw/CUDA_Samples/NVIDIA_CUDA-8.0_Samples/bin/x86_64/linux/release# ./deviceQuery
./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Tesla K20c"
CUDA Driver Version / Runtime Version 8.0 / 8.0
CUDA Capability Major/Minor version number: 3.5
Total amount of global memory: 4742 MBytes (4972412928 bytes)
(13) Multiprocessors, (192) CUDA Cores/MP: 2496 CUDA Cores
GPU Max Clock rate: 706 MHz (0.71 GHz)
Memory Clock rate: 2600 Mhz
Memory Bus Width: 320-bit
L2 Cache Size: 1310720 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Enabled
Device supports Unified Addressing (UVA): Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 2 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = Tesla K20c
Result = PASS

The output lists the GPU's properties and ends with Result = PASS, which means CUDA really is fully installed.

Next, check the connection between the system and the CUDA-capable device with bandwidthTest:

root@t620:/home/zjw/CUDA_Samples/NVIDIA_CUDA-8.0_Samples/bin/x86_64/linux/release# ./bandwidthTest
[CUDA Bandwidth Test] - Starting...
Running on...

Device 0: Tesla K20c
Quick Mode

Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 6160.9

Device to Host Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 6550.6

Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 146967.3

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

4 cuDNN

Official installation guide (following it step by step rarely goes wrong).

4.1 Downloading cuDNN

cuDNN is NVIDIA's GPU-accelerated library of primitives for deep neural networks. Download it from the official page (https://developer.nvidia.com/rdp/cudnn-download); a free NVIDIA developer account is required, so register one if you do not have it. For a K20c with CUDA 8.0, the latest matching release is v7:

The download can crawl along at a few KB/s because NVIDIA blocks many mainland-China IPs; switching a proxy to global mode so the download goes through a foreign IP fixes it (tested).

4.2 Installing cuDNN

Installing cuDNN is straightforward: it is just a matter of copying a few files, the header and the libraries. Copy cuDNN's header into the include directory of the CUDA install path and its libraries into the lib64 directory. Concretely:

# unpack the archive (the browser saves it with an odd extension; rename it to .tgz first)
zjw@t620:~$ cp cudnn-8.0-linux-x64-v7.solitairetheme8 cudnn-8.0-linux-x64-v7.tgz
zjw@t620:~$ tar -zxvf cudnn-8.0-linux-x64-v7.tgz

# change into the cuda folder that was just extracted
cd cuda
# copy the header into CUDA's include directory
sudo cp include/cudnn.h /usr/local/cuda/include/

# copy the libraries from lib64 into CUDA's lib64 directory
sudo cp lib64/libcudnn* /usr/local/cuda/lib64/

# set permissions
sudo chmod a+r /usr/local/cuda/include/cudnn.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*

# ====== refresh the symlinks ======
cd /usr/local/cuda/lib64/
sudo rm -rf libcudnn.so libcudnn.so.7 # remove the old links; mind the version number, which you can check in cuDNN's lib64 folder
sudo ln -s libcudnn.so.7.0.5 libcudnn.so.7 # create the symlink (match the cuDNN version you actually downloaded; check under /usr/local/cuda/lib64)
sudo ln -s libcudnn.so.7 libcudnn.so # create the symlink
sudo ldconfig -v # apply immediately

Note: the version numbers in the symlinks above must match the cuDNN release you actually downloaded.
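A quick way to see which version the copied header reports (a sketch; the macro names below are what the cuDNN 7 header uses):

# print the cuDNN version macros from the freshly installed header
grep -A 2 "CUDNN_MAJOR" /usr/local/cuda/include/cudnn.h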

Finally, verify that CUDA still works after installing cuDNN:

zjw@t620:/usr/local/cuda/lib64$ nvcc --version	# or nvcc -V 
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61

4.3 Verifying the cuDNN installation

  At this point cuDNN is installed, but to confirm it really works we can run a cuDNN sample. On https://developer.nvidia.com/rdp/cudnn-archive find the entry for your cuDNN version; it includes a Code Samples download (the listed version may differ, e.g. "cuDNN v5 Code Samples"; just take the newest one that matches).

  After downloading and extracting, go into the mnistCUDNN directory:

# Copy the cuDNN sample to a writable path. 
$cp -r /usr/src/cudnn_samples_v7/ $HOME

# Go to the writable path
$ cd $HOME/cudnn_samples_v7/mnistCUDNN

# Compile the mnistCUDNN sample
$make clean
$make

Run the mnistCUDNN sample

zjw@t620:~/cudnn_samples_v7/mnistCUDNN$ ./mnistCUDNN
cudnnGetVersion() : 7005 , CUDNN_VERSION from cudnn.h : 7005 (7.0.5)
Host compiler version : GCC 5.4.0
There are 1 CUDA capable devices on your machine :
device 0 : sms 13 Capabilities 3.5, SmClock 705.5 Mhz, MemSize (Mb) 4742, MemClock 2600.0 Mhz, Ecc=1, boardGroupID=0
Using device 0

Testing single precision
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 2
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.041376 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.079680 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.092288 time requiring 100 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.153120 time requiring 203008 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.172800 time requiring 207360 memory
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006

Result of classification: 1 3 5

Test passed!

Testing half precision (math in single precision)
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 2
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.061024 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.061312 time requiring 100 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.086560 time requiring 28800 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.172704 time requiring 203008 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.186944 time requiring 207360 memory
Resulting weights from Softmax:
0.0000001 1.0000000 0.0000001 0.0000000 0.0000563 0.0000001 0.0000012 0.0000017 0.0000010 0.0000001
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000714 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 1.0000000 0.0000154 0.0000000 0.0000012 0.0000006

Result of classification: 1 3 5

Test passed!

Test passed! cuDNN is now installed and working.

5 Anaconda3

Anaconda is a Python distribution for scientific computing that bundles hundreds of commonly used packages, many of which are TensorFlow dependencies. Installing Anaconda gives a convenient environment in which to install TensorFlow directly.

bash Anaconda3-4.2.0-Linux-x86_64.sh

Run the installer: press Enter to page through the license and accept it, and answer yes when asked at the end whether to add Anaconda's bin directory to your PATH. If typing python afterwards still starts the system Python, the environment change simply has not taken effect yet; reload it with the commands below. If conda --version prints nothing, the PATH entry was not added and has to be appended by hand, as shown below:


root@t620:/home/zjw# vim ~/.bashrc
root@t620:/home/zjw# source ~/.bashrc
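The entry appended to ~/.bashrc is typically the following; a sketch assuming Anaconda was installed into the default location under the home directory:

# put Anaconda's python and conda ahead of the system ones
export PATH="$HOME/anaconda3/bin:$PATH"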

Check whether the environment change took effect:

zjw@t620:~$ conda --version
conda 4.4.10

zjw@t620:~$ python
Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 18:10:19)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

The python command now runs the Anaconda build rather than the system one, so the environment update is in effect.

6 TensorFlow

See TensorFlow's official installation guide (https://www.tensorflow.org/install/); it covers installing with pip, Docker, Virtualenv, Anaconda, or building from source. Here we install through Anaconda; for the other methods, refer to the official guide.

6.1 Installing TensorFlow

We install TensorFlow through Anaconda. TensorFlow's official downloads are now hosted on GitHub (https://github.com/tensorflow/tensorflow); find the matching version number, as shown:

Official docs on installing TensorFlow with Anaconda: https://www.tensorflow.org/install/install_linux#InstallingAnaconda

6.2 Creating a conda environment named tensorflow with Python 3.6

#Python 2.7
conda create -n tensorflow python=2.7

#Python 3.4
conda create -n tensorflow python=3.4

#Python 3.5
conda create -n tensorflow python=3.5

#Python 3.6
conda create -n tensorflow python=3.6   # the TensorFlow wheel I downloaded targets Python 3.6, so this is the line I use

Note: the Python version must match the TensorFlow build you are going to install. This really matters; otherwise you will run into all kinds of errors later.

An error occurred during creation:

zjw@t620:~$ conda create -n tensorflow python=3.6
Solving environment: failed

CondaHTTPError: HTTP 000 CONNECTION FAILED for url <https://repo.continuum.io/pkgs/main/linux-64/repodata.json.bz2>
Elapsed: -
...

Fix: turn off the VPN.

The following may also help, though it is not guaranteed to work:

# add the Tsinghua mirror channels first
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/msys2/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --set show_channel_urls yes

6.3 Activating the conda environment

zjw@t620:~$ conda create -n tensorflow pip python=3.6
Solving environment: done


==> WARNING: A newer version of conda exists. <==
current version: 4.4.10
latest version: 4.5.11

Please update conda by running

$ conda update -n base conda



## Package Plan ##

environment location: /home/zjw/.conda/envs/tensorflow

added / updated specs:
- pip
- python=3.6


The following packages will be downloaded:

package | build
---------------------------|-----------------
xz-5.2.3 | 0 667 KB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
tk-8.5.18 | 0 1.9 MB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
wheel-0.29.0 | py36_0 88 KB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
openssl-1.0.2l | 0 3.2 MB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
readline-6.2 | 2 606 KB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
python-3.6.2 | 0 16.5 MB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
pip-9.0.1 | py36_1 1.7 MB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
sqlite-3.13.0 | 0 4.0 MB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
certifi-2016.2.28 | py36_0 216 KB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
setuptools-36.4.0 | py36_1 563 KB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
zlib-1.2.11 | 0 109 KB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
------------------------------------------------------------
Total: 29.3 MB

The following NEW packages will be INSTALLED:

certifi: 2016.2.28-py36_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
openssl: 1.0.2l-0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
pip: 9.0.1-py36_1 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
python: 3.6.2-0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
readline: 6.2-2 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
setuptools: 36.4.0-py36_1 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
sqlite: 3.13.0-0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
tk: 8.5.18-0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
wheel: 0.29.0-py36_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
xz: 5.2.3-0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
zlib: 1.2.11-0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free

Proceed ([y]/n)? y


Downloading and Extracting Packages
xz 5.2.3: ################################################################################### | 100%
tk 8.5.18: ################################################################################## | 100%
wheel 0.29.0: ############################################################################### | 100%
openssl 1.0.2l: ############################################################################# | 100%
readline 6.2: ############################################################################### | 100%
python 3.6.2: ############################################################################### | 100%
pip 9.0.1: ################################################################################## | 100%
sqlite 3.13.0: ############################################################################## | 100%
certifi 2016.2.28: ########################################################################## | 100%
setuptools 36.4.0: ########################################################################## | 100%
zlib 1.2.11: ################################################################################ | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate tensorflow
#
# To deactivate an active environment, use
#
# $ conda deactivate

zjw@t620:~$ conda activate tensorflow

or, with the older syntax:

source activate tensorflow

6.4 Installing the GPU build of TensorFlow into the conda environment

Since the conda environment was created with Python 3.6, we use the URL of the Python 3.6 GPU wheel for the install:

# install the TensorFlow build for Python 3.6

sudo pip3 install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.3.0-cp36-cp36m-linux_x86_64.whl

# note: the cpXX / cpXXm in the URL encodes the Python version number

That failed, so instead download the GPU wheel separately, then activate the environment by hand and install the downloaded .whl:

(tensorflow) zjw@t620:~$ wget https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.3.0-cp36-cp36m-linux_x86_64.whl
source activate tensorflow    # activate the tensorflow environment (skip this if it is already active)
(tensorflow) zjw@t620:~$ pip install --ignore-installed --upgrade tensorflow_gpu-1.3.0-cp36-cp36m-linux_x86_64.whl
Processing ./tensorflow_gpu-1.3.0-cp36-cp36m-linux_x86_64.whl
Collecting wheel>=0.26 (from tensorflow-gpu==1.3.0)
Downloading https://files.pythonhosted.org/packages/81/30/e935244ca6165187ae8be876b6316ae201b71485538ffac1d718843025a9/wheel-0.31.1-py2.py3-none-any.whl (41kB)
100% |████████████████████████████████| 51kB 96kB/s
Collecting numpy>=1.11.0 (from tensorflow-gpu==1.3.0)
Downloading https://files.pythonhosted.org/packages/22/02/bae88c4aaea4256d890adbf3f7cf33e59a443f9985cf91cd08a35656676a/numpy-1.15.2-cp36-cp36m-manylinux1_x86_64.whl (13.9MB)
100% |████████████████████████████████| 13.9MB 46kB/s
Collecting protobuf>=3.3.0 (from tensorflow-gpu==1.3.0)
Downloading https://files.pythonhosted.org/packages/c2/f9/28787754923612ca9bfdffc588daa05580ed70698add063a5629d1a4209d/protobuf-3.6.1-cp36-cp36m-manylinux1_x86_64.whl (1.1MB)
100% |████████████████████████████████| 1.1MB 110kB/s
Collecting six>=1.10.0 (from tensorflow-gpu==1.3.0)
Downloading https://files.pythonhosted.org/packages/67/4b/141a581104b1f6397bfa78ac9d43d8ad29a7ca43ea90a2d863fe3056e86a/six-1.11.0-py2.py3-none-any.whl
Collecting tensorflow-tensorboard<0.2.0,>=0.1.0 (from tensorflow-gpu==1.3.0)
Retrying (Retry(total=4, connect=None, read=None, redirect=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='pypi.python.org', port=443): Read timed out. (read timeout=15)",)': /simple/tensorflow-tensorboard/
Downloading https://files.pythonhosted.org/packages/93/31/bb4111c3141d22bd7b2b553a26aa0c1863c86cb723919e5bd7847b3de4fc/tensorflow_tensorboard-0.1.8-py3-none-any.whl (1.6MB)
100% |████████████████████████████████| 1.6MB 531kB/s
Collecting setuptools (from protobuf>=3.3.0->tensorflow-gpu==1.3.0)
Downloading https://files.pythonhosted.org/packages/6e/9c/cc2eb661d85f4aa541910af1a72b834a0f5c9209079fcbd1438fa6da17c6/setuptools-40.4.2-py2.py3-none-any.whl (569kB)
100% |████████████████████████████████| 573kB 303kB/s
Collecting werkzeug>=0.11.10 (from tensorflow-tensorboard<0.2.0,>=0.1.0->tensorflow-gpu==1.3.0)
Downloading https://files.pythonhosted.org/packages/20/c4/12e3e56473e52375aa29c4764e70d1b8f3efa6682bef8d0aae04fe335243/Werkzeug-0.14.1-py2.py3-none-any.whl (322kB)
100% |████████████████████████████████| 327kB 329kB/s
Collecting bleach==1.5.0 (from tensorflow-tensorboard<0.2.0,>=0.1.0->tensorflow-gpu==1.3.0)
Downloading https://files.pythonhosted.org/packages/33/70/86c5fec937ea4964184d4d6c4f0b9551564f821e1c3575907639036d9b90/bleach-1.5.0-py2.py3-none-any.whl
Collecting html5lib==0.9999999 (from tensorflow-tensorboard<0.2.0,>=0.1.0->tensorflow-gpu==1.3.0)
Downloading https://files.pythonhosted.org/packages/ae/ae/bcb60402c60932b32dfaf19bb53870b29eda2cd17551ba5639219fb5ebf9/html5lib-0.9999999.tar.gz (889kB)
100% |████████████████████████████████| 890kB 274kB/s
Collecting markdown>=2.6.8 (from tensorflow-tensorboard<0.2.0,>=0.1.0->tensorflow-gpu==1.3.0)
Downloading https://files.pythonhosted.org/packages/7a/fd/e22357c299e93c0bc11ec8ba54e79f98dd568e09adfe9b39d6852c744938/Markdown-3.0-py2.py3-none-any.whl (89kB)
100% |████████████████████████████████| 92kB 331kB/s
Building wheels for collected packages: html5lib
Running setup.py bdist_wheel for html5lib ... done
Stored in directory: /home/zjw/.cache/pip/wheels/50/ae/f9/d2b189788efcf61d1ee0e36045476735c838898eef1cad6e29
Successfully built html5lib
Installing collected packages: wheel, numpy, six, setuptools, protobuf, werkzeug, html5lib, bleach, markdown, tensorflow-tensorboard, tensorflow-gpu
Successfully installed bleach-1.5.0 html5lib-0.9999999 markdown-3.0 numpy-1.15.2 protobuf-3.6.1 setuptools-40.4.2 six-1.11.0 tensorflow-gpu-1.3.0 tensorflow-tensorboard-0.1.8 werkzeug-0.14.1 wheel-0.31.1
You are using pip version 9.0.1, however version 18.0 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

6.5 Deactivating the environment when you are done with TensorFlow

source deactivate tensorflow

6.6 After installation, activate the conda environment (see the activation step above) each time you want to use TensorFlow

6.7 Common problems

The error "ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory" appears:

Python 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 13:51:32) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
Traceback (most recent call last):
File "/home/cmfchina/.conda/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 41, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "/home/cmfchina/.conda/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "/home/cmfchina/.conda/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "/home/cmfchina/.conda/envs/tensorflow/lib/python3.6/imp.py", line 242, in load_module
return load_dynamic(name, filename, file)
File "/home/cmfchina/.conda/envs/tensorflow/lib/python3.6/imp.py", line 342, in load_dynamic
return _load(spec)
ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/cmfchina/.conda/envs/tensorflow/lib/python3.6/site-packages/tensorflow/__init__.py", line 24, in <module>
from tensorflow.python import *
File "/home/cmfchina/.conda/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/__init__.py", line 49, in <module>
from tensorflow.python import pywrap_tensorflow
File "/home/cmfchina/.conda/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 52, in <module>
raise ImportError(msg)
ImportError: Traceback (most recent call last):
File "/home/cmfchina/.conda/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 41, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "/home/cmfchina/.conda/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "/home/cmfchina/.conda/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "/home/cmfchina/.conda/envs/tensorflow/lib/python3.6/imp.py", line 242, in load_module
return load_dynamic(name, filename, file)
File "/home/cmfchina/.conda/envs/tensorflow/lib/python3.6/imp.py", line 342, in load_dynamic
return _load(spec)
ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory

Fix:

  • First check whether libcudnn.so.* exists:

    find / -name "libcudnn.so.*"

    If the file turns up, move on to the next step; if not, recheck the cuDNN installation and the environment variables set earlier.

  • Create a symlink so that the missing libcudnn.so.6 name points at the installed version 7 library:

    sudo ln -s <path>/libcudnn.so.7.* <path>/libcudnn.so.6  # <path> is the directory containing libcudnn.so.7; or

    sudo ln -s libcudnn.so.7.* libcudnn.so.6  # after cd'ing into the directory containing libcudnn.so.7

6.8 Uninstalling TensorFlow

  To uninstall TensorFlow, use:

sudo pip uninstall tensorflow    # Python 2.7
sudo pip3 uninstall tensorflow   # Python 3.x
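Note that the wheel installed earlier inside the conda environment is the tensorflow-gpu package (see the pip install log above), so removing that particular install looks more like the following sketch, run from within the activated environment:

# remove the GPU build that was installed from the .whl inside the conda env
source activate tensorflow
pip uninstall tensorflow-gpu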

6.9 Testing TensorFlow

(tensorflow) zjw@t620:/usr/local$ python
Python 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 13:51:32)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
2018-09-24 10:25:36.621775: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-09-24 10:25:36.621832: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-09-24 10:25:36.621852: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-09-24 10:25:38.176557: I tensorflow/core/common_runtime/gpu/gpu_device.cc:955] Found device 0 with properties:
name: Tesla K20c
major: 3 minor: 5 memoryClockRate (GHz) 0.7055
pciBusID 0000:02:00.0
Total memory: 4.63GiB
Free memory: 4.57GiB
2018-09-24 10:25:38.176610: I tensorflow/core/common_runtime/gpu/gpu_device.cc:976] DMA: 0
2018-09-24 10:25:38.176620: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 0: Y
2018-09-24 10:25:38.176637: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K20c, pci bus id: 0000:02:00.0)
>>> sess.run(hello)
b'Hello, TensorFlow!'
>>> a = tf.constant(10)
>>> b = tf.constant(32)
>>> sess.run(a + b)
42
>>> sess.close()
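To see at a glance which devices TensorFlow picked up, the following one-liner (run inside the activated environment) is also handy; device_lib is TensorFlow 1.x's device-listing helper:

# list the devices TensorFlow can see; a /gpu:0 entry backed by the Tesla K20c confirms GPU support
python -c "from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())"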

7 References

https://www.cnblogs.com/xuliangxing/p/7575586.html