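I have recently been studying deep learning and needed to install the Keras framework on my own machine, since it is easy to get started with. After reading around online and several days of tinkering, I finally got everything working today. This post covers GPU acceleration; for CPU-only processing, see my earlier post, "Keras environment setup on Win7".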

My machine: Win7 64-bit, 4 GB RAM, GTX 970 graphics card

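Prerequisites:

  VS2010 (it does not have to be VS2010, I just happened to have it; GPU programming appears to need the VS compiler)

  CUDA: on a 64-bit system, download the 64-bit build. As for the CUDA version, some people say it has to match the graphics card. I installed 8.0, and in my experience the CUDA version and the card model do not seem to be tightly coupled.

  cuDNN: accelerates deep learning. It is optional, but code runs much more efficiently with it. Make sure the cuDNN version matches the CUDA version.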

 

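The preliminary steps are the same as for CPU-only computation; see the earlier post, "Keras environment setup on Win7".

Once those steps are done, continue as follows: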

 

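1 Install VS2010. Selecting only the C++ language is enough.

2 Install CUDA. I installed CUDA 8. In the installer, choose "Custom" and select all components, and preferably keep the default install location. The installation is simple but may take a while.

  After installation, the environment variables should contain two new entries, CUDA_PATH_V8_0 and CUDA_PATH.

  Open a cmd console, type nvcc -V (note the capital V) and press Enter to check the version. If the installation is correct, it prints the CUDA version number: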

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Sat_Sep__3_19:05:48_CDT_2016
Cuda compilation tools, release 8.0, V8.0.44
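
The same check can be scripted, which is handy when setting up more than one machine. This is only a minimal sketch and not part of the original setup: the helper names parse_nvcc_release and installed_cuda_release are made up for illustration, and it assumes nvcc is on PATH.

```python
import re
import subprocess


def parse_nvcc_release(output):
    """Return the release string (e.g. '8.0') found in `nvcc -V` output, or None."""
    match = re.search(r"release\s+(\d+(?:\.\d+)*)", output)
    return match.group(1) if match else None


def installed_cuda_release():
    """Run `nvcc -V` and parse its output; returns None if nvcc cannot be run."""
    try:
        raw = subprocess.check_output(["nvcc", "-V"])
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_nvcc_release(raw.decode("utf-8", "replace"))
```

Calling installed_cuda_release() on a machine set up as described here should return the release that nvcc reports; a None result suggests nvcc is not reachable from the current PATH.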

 

3 Edit the configuration file .theanorc.txt as follows:

[global]
openmp=False
device = gpu0
floatX = float32
allow_input_downcast=True
[lib]
cnmem = 1
[blas]
ldflags=
[gcc]
# path to gcc (MinGW)
cxxflags=-ID:\Anaconda2\MinGW
[nvcc]
# path to the Anaconda libs directory
flags = -LD:\Anaconda2\libs
# must match your VS install path; for a default install it should be C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin
compiler_bindir = D:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin
fastmath = True

Note: some configuration files found online lack the [lib] section. Without it, importing theano later prints "CNMeM is disabled".
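
Before importing theano, it can help to confirm that the file really contains the sections this step depends on. The sketch below is an illustration only: missing_settings and the REQUIRED list are hypothetical names, and the list reflects what this post uses rather than anything Theano itself mandates.

```python
import io

try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2

# Settings this post relies on, as (section, option) pairs.
REQUIRED = [
    ("global", "device"),
    ("global", "floatX"),
    ("lib", "cnmem"),
    ("nvcc", "compiler_bindir"),
]


def missing_settings(text):
    """Return the REQUIRED (section, option) pairs absent from the config text."""
    parser = configparser.RawConfigParser()
    # read_file exists on Python 3; readfp is the Python 2 spelling.
    reader = getattr(parser, "read_file", None) or parser.readfp
    reader(io.StringIO(text))
    return [pair for pair in REQUIRED if not parser.has_option(*pair)]
```

To use it, pass in the contents of your .theanorc.txt as a unicode string. An empty list means all four settings are present, while [("lib", "cnmem")] would point at the missing [lib] block mentioned in the note above.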

 

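4 Install cuDNN

  Extract the downloaded archive. It yields a cuda folder containing 3 folders. Copy these three folders over the corresponding folders in the CUDA install directory, overwriting the existing files: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0

Note: if the files are not overwritten, importing theano later prints "CuDNN not available".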
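
The overwrite-copy can also be done with a small script instead of Explorer. This is a rough sketch under assumptions: merge_tree is a made-up helper name, the source path in the usage note is hypothetical, and writing under C:\Program Files usually needs an elevated (administrator) prompt.

```python
import os
import shutil


def merge_tree(src_root, dst_root):
    """Copy every file under src_root into dst_root, overwriting same-named files.

    Files that exist only in dst_root are left untouched.
    Returns the sorted list of copied paths, relative to src_root.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel))
        if not os.path.isdir(target_dir):
            os.makedirs(target_dir)
        for name in filenames:
            shutil.copy2(os.path.join(dirpath, name), os.path.join(target_dir, name))
            copied.append(os.path.normpath(os.path.join(rel, name)))
    return sorted(copied)
```

For example, merge_tree(r"D:\Downloads\cuda", r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0") would merge the extracted cuda folder into the install directory, where the first path is only a placeholder for wherever you extracted the archive.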

 

5 Switch the backend, because I use theano while keras defaults to tensorflow. The English documentation explains how:

Switching from one backend to another

If you have run Keras at least once, you will find the Keras configuration file at:

~/.keras/keras.json

If it isn't there, you can create it.

The default configuration file looks like this:

{
    "image_dim_ordering": "tf",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow"
}

Simply change the field backend to either "theano" or "tensorflow", and Keras will use the new configuration next time you run any Keras code.

Just follow those instructions.
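
If you prefer not to edit the JSON by hand, the change can be scripted. The following is a minimal sketch, assuming the file lives at the location quoted above; set_keras_backend is a made-up helper name, and it only touches the "backend" field. If the file does not exist yet, the sketch creates one containing just that field.

```python
import json
import os


def set_keras_backend(backend, path=None):
    """Set the "backend" field in keras.json, keeping all other fields."""
    if path is None:
        # Default location quoted in the Keras documentation excerpt above.
        path = os.path.join(os.path.expanduser("~"), ".keras", "keras.json")
    settings = {}
    if os.path.exists(path):
        with open(path) as fh:
            settings = json.load(fh)
    settings["backend"] = backend
    folder = os.path.dirname(path)
    if folder and not os.path.isdir(folder):
        os.makedirs(folder)
    with open(path, "w") as fh:
        json.dump(settings, fh, indent=4)
    return settings
```

Calling set_keras_backend("theano") once is enough; Keras picks up the new value the next time it is imported.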

 

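6 At this point everything should normally be working, so run some tests. The test code is below.

Test 1: in a cmd window, start python and enter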

>>> import theano
DEBUG: nvcc STDOUT nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' archite
ctures are deprecated, and may be removed in a future release (Use -Wno-deprecat
ed-gpu-targets to suppress warning).
nvcc warning : nvcc support for Microsoft Visual Studio 2010 and earlier has bee
n deprecated and is no longer being maintained
mod.cu
support for Microsoft Visual Studio 2010 has been deprecated!
   正在创建库 C:/Users/allen/AppData/Local/Theano/compiledir_Windows-7-6.1.7601-
SP1-Intel64_Family_6_Model_60_Stepping_3_GenuineIntel-2.7.12-64/tmp1wscvx/265abc
51f7c376c224983485238ff1a5.lib 和对象 C:/Users/allen/AppData/Local/Theano/compil
edir_Windows-7-6.1.7601-SP1-Intel64_Family_6_Model_60_Stepping_3_GenuineIntel-2.
7.12-64/tmp1wscvx/265abc51f7c376c224983485238ff1a5.exp

Using gpu device 0: GeForce GTX 970 (CNMeM is enabled with initial size: 95.0% o
f memory, cuDNN 5005)

 

  

Test 2: run the following script

from theano import function, config, shared, sandbox
import theano.tensor as T
import numpy
import time

vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
iters = 1000

rng = numpy.random.RandomState(22)
# shared variable holding random data in the configured float type
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], T.exp(x))
# print the compiled graph; GPU ops show up with a Gpu prefix
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in range(iters):
    r = f()
t1 = time.time()
print("Looping %d times took %f seconds" % (iters, t1 - t0))
print("Result is %s" % (r,))
# a plain Elemwise op in the graph means the computation ran on the CPU
if numpy.any([isinstance(node.op, T.Elemwise) for node in f.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')

Output:
DEBUG: nvcc STDOUT nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc warning : nvcc support for Microsoft Visual Studio 2010 and earlier has been deprecated and is no longer being maintained
mod.cu
support for Microsoft Visual Studio 2010 has been deprecated!
   正在创建库 C:/Users/allen/AppData/Local/Theano/compiledir_Windows-7-6.1.7601-SP1-Intel64_Family_6_Model_60_Stepping_3_GenuineIntel-2.7.12-64/tmpmdncsl/265abc51f7c376c224983485238ff1a5.lib 和对象 C:/Users/allen/AppData/Local/Theano/compiledir_Windows-7-6.1.7601-SP1-Intel64_Family_6_Model_60_Stepping_3_GenuineIntel-2.7.12-64/tmpmdncsl/265abc51f7c376c224983485238ff1a5.exp

Using gpu device 0: GeForce GTX 970 (CNMeM is enabled with initial size: 95.0% of memory, cuDNN 5005)
 
[GpuElemwise{exp,no_inplace}(<CudaNdarrayType(float32, vector)>), HostFromGpu(GpuElemwise{exp,no_inplace}.0)]
Looping 1000 times took 0.572000 seconds
Result is [ 1.23178029  1.61879349  1.52278066 ...,  2.20771813  2.29967761
  1.62323296]
Used the gpu

If the output reports that the GPU was used, everything is working.