A brief look at installing different Caffe versions for py-faster-rcnn and matching them with the right cuDNN version

 

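This is the most thorough guide I know of to date: following it should get you through installing and configuring Caffe and Ross B. Girshick's code with little trouble. If you repost it, please credit the source.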

Copyright 飞翔的蜘蛛人

 

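Note 1: I am a beginner. Corrections to anything inaccurate in this article are welcome.

Note 2: Before reading, you should be familiar with:

1)      Basic Linux usage

2)      Installing the NVIDIA driver and CUDA on Ubuntu; see my other blog post:

Notes on installing the NVIDIA Tesla K40 driver and CUDA 7.5 on Ubuntu 14.04

http://www.cnblogs.com/muchong/p/6093328.html

3)      Installing cuDNN and Caffe; see the post by YaoyaoLiu:

A concise Caffe setup tutorial ( Ubuntu 14.04 / CUDA 7.5 / cuDNN 5.1 )

http://www.cnblogs.com/yaoyaoliu/p/5850993.html

as well as the official Caffe installation instructions: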

http://caffe.berkeleyvision.org/installation.html

 

 

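I. Caffe installation

If you are familiar with items 1), 2), and 3) above, installing the latest version of Caffe should go smoothly. If you do run into problems, see the articles recommended above.

One problem those articles may not cover:

error while loading shared libraries: libcudnn.so.5: cannot open shared object file: No such file or directory

Cause: the directory holding the libraries of my own Caffe build had not been added to the dynamic linker configuration (/etc/ld.so.conf).

Fix:

Go to the directory that holds your Caffe build's libraries and add its path to the linker configuration. On Ubuntu, /etc/ld.so.conf includes every .conf file under /etc/ld.so.conf.d/, so I add a new file there: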

tju@tju-System-Product-Name:~$ cd tju/caffe/build/lib/

tju@tju-System-Product-Name:~/tju/caffe/build/lib$ pwd

/home/tju/tju/caffe/build/lib

 

tju@tju-System-Product-Name:~/tju/caffe$ cd /etc/ld.so.conf.d/

tju@tju-System-Product-Name:/etc/ld.so.conf.d$ ls -l

total 20

-rw-r--r-- 1 root root 22 Nov 24 18:17 cuda.conf

-rw-rw-r-- 1 root root 38 Mar 24  2014 fakeroot-x86_64-linux-gnu.conf

lrwxrwxrwx 1 root root 41 Nov 24 18:09 i386-linux-gnu_EGL.conf -> /etc/alternatives/i386-linux-gnu_egl_conf

lrwxrwxrwx 1 root root 40 Nov 24 18:09 i386-linux-gnu_GL.conf -> /etc/alternatives/i386-linux-gnu_gl_conf

-rw-r--r-- 1 root root 44 Aug 10  2009 libc.conf

-rw-r--r-- 1 root root 68 Apr 12  2014 x86_64-linux-gnu.conf

lrwxrwxrwx 1 root root 43 Nov 24 18:09 x86_64-linux-gnu_EGL.conf -> /etc/alternatives/x86_64-linux-gnu_egl_conf

lrwxrwxrwx 1 root root 42 Nov 24 18:09 x86_64-linux-gnu_GL.conf -> /etc/alternatives/x86_64-linux-gnu_gl_conf

-rw-r--r-- 1 root root 56 May 26 18:28 zz_i386-biarch-compat.conf

 

tju@tju-System-Product-Name:/etc/ld.so.conf.d$ sudo touch libcudnn.conf

tju@tju-System-Product-Name:/etc/ld.so.conf.d$ sudo gedit libcudnn.conf

Write the line /home/tju/tju/caffe/build/lib into the file and save it, then refresh the linker cache:

sudo ldconfig
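The same change can be made without a GUI editor. The following is only a minimal sketch, assuming a POSIX shell; the helper name add_ld_path is my own illustration and not part of any tool mentioned here. It appends a directory to a linker configuration file only if that exact line is not already there.

```shell
# add_ld_path DIR CONF_FILE
# Appends DIR to CONF_FILE unless an identical line is already present.
# (Hypothetical helper; writing under /etc requires root rights.)
add_ld_path() {
    dir=$1
    conf=$2
    # -x matches whole lines only, -F treats DIR as a literal string
    if [ -f "$conf" ] && grep -qxF -- "$dir" "$conf"; then
        return 0
    fi
    printf '%s\n' "$dir" >> "$conf"
}
```

In a root shell (for example after sudo -s), calling add_ld_path /home/tju/tju/caffe/build/lib /etc/ld.so.conf.d/libcudnn.conf and then ldconfig would mirror the manual steps above. Afterwards, ldconfig -p | grep libcudnn shows whether the linker cache now lists the library.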

 

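II. My ill-advised way of installing py-faster-rcnn

https://github.com/rbgirshick/py-faster-rcnn

A few words up front:

  1. Install the prerequisite dependencies from the terminal; avoid GUI tools such as the Synaptic Package Manager where possible.
  2. Following the installation steps in the original README, first install the Python packages cython, python-opencv, and easydict; this avoids several errors during compilation.

Install commands: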

sudo apt-get install cython

sudo apt-get install python-opencv

sudo pip install easydict

  3. The path where py-faster-rcnn is stored must not contain Chinese characters; otherwise the build fails with UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position
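The path restriction in point 3 can be checked before cloning. This is a rough sketch, assuming a POSIX shell with grep; path_is_ascii is a hypothetical helper name of mine. It treats any byte outside the printable ASCII range as a problem, which is stricter than "no Chinese characters" but covers that case.

```shell
# path_is_ascii PATH
# Succeeds (exit status 0) only if PATH contains nothing but printable ASCII.
path_is_ascii() {
    # In the C locale grep compares byte by byte; [^ -~] matches any byte
    # outside the range from space to tilde, e.g. the bytes of UTF-8 Chinese.
    if printf '%s' "$1" | LC_ALL=C grep -q '[^ -~]'; then
        return 1
    fi
    return 0
}
```

A possible use before running git clone: path_is_ascii "$(pwd)" || echo "choose a directory whose path is plain ASCII".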

 

Installing py-faster-rcnn

 

cd <directory where you want to keep the files>

git clone --recursive https://github.com/rbgirshick/py-faster-rcnn.git

cd py-faster-rcnn/lib/

make

 

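Delete all files in caffe-fast-rcnn

Copy all files from a fresh, uncompiled copy of the latest Caffe into caffe-fast-rcnn (as the rest of this section shows, this is where the trouble starts)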

cd ../caffe-fast-rcnn/

cd python/

for req in $(cat requirements.txt); do sudo pip install $req; done

cd ..

cp Makefile.config.example Makefile.config

make all -j16

make test -j16

make runtest -j16

This passes.

make pycaffe

This passes too.
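The build steps above copy Makefile.config.example without editing it. The py-faster-rcnn README additionally asks for WITH_PYTHON_LAYER := 1 to be uncommented in Makefile.config and recommends USE_CUDNN := 1. A hedged sketch of doing that from the shell, assuming GNU sed and that the flags appear as commented-out lines such as "# WITH_PYTHON_LAYER := 1"; enable_make_flag is a hypothetical helper name:

```shell
# enable_make_flag MAKEFILE_CONFIG FLAG
# Uncomments a line of the form "# FLAG := 1" in MAKEFILE_CONFIG (in place).
# Assumes GNU sed (for -i) and that FLAG contains no regex metacharacters.
enable_make_flag() {
    sed -i "s/^#[[:space:]]*\($2[[:space:]]*:=[[:space:]]*1\)/\1/" "$1"
}
```

For example, enable_make_flag Makefile.config WITH_PYTHON_LAYER, run in caffe-fast-rcnn before make all. Lines for other flags are left untouched.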

 

Put the downloaded faster_rcnn_models into py-faster-rcnn/data, then run the demo:

./tools/demo.py

This is where things went wrong: one error after another, ending in one I could not solve.

 

1.ImportError: No module named scipy

Fix: sudo pip install scipy

2.error: library dfftpack has Fortran sources but no Fortran compiler found

Fix: sudo apt-get install gfortran

http://blog.csdn.net/u010551621/article/details/46363853

3.Error parsing text-format caffe.NetParameter: 350:21: Message type "caffe.LayerParameter" has no field named "roi_pooling_param"

ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: /home/tju/tju/py-faster-rcnn/models/pascal_voc/VGG16/faster_rcnn_alt_opt/faster_rcnn_test.pt

Full error output:

/usr/local/lib/python2.7/dist-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.

  warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')

WARNING: Logging before InitGoogleLogging() is written to STDERR

W1124 23:36:53.414587  9519 _caffe.cpp:122] DEPRECATION WARNING - deprecated use of Python interface

W1124 23:36:53.414649  9519 _caffe.cpp:123] Use this instead (with the named "weights" parameter):

W1124 23:36:53.414660  9519 _caffe.cpp:125] Net('/home/tju/tju/py-faster-rcnn/models/pascal_voc/VGG16/faster_rcnn_alt_opt/faster_rcnn_test.pt', 1, weights='/home/tju/tju/py-faster-rcnn/data/faster_rcnn_models/VGG16_faster_rcnn_final.caffemodel')

[libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 350:21: Message type "caffe.LayerParameter" has no field named "roi_pooling_param".

F1124 23:36:53.437803  9519 upgrade_proto.cpp:88] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: /home/tju/tju/py-faster-rcnn/models/pascal_voc/VGG16/faster_rcnn_alt_opt/faster_rcnn_test.pt

*** Check failure stack trace: ***

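Fix: Baidu turned up essentially nothing for this error, so I switched to Google. I found no solution in Chinese. In English there is one related question with an answer (the Stack Overflow link in the references), but it does not explain how to carry out the fix.

Cause: the Caffe version is wrong. The network definition uses roi_pooling_param, a field defined by the Caffe fork bundled with py-faster-rcnn (caffe-fast-rcnn); the latest Caffe that I copied over it does not define that field.

Basic idea: update your build of Caffe and run the upgrade tool (tools/upgrade_net_proto_text.cpp, which I try to compile by hand below).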
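Whether a given Caffe source tree knows the field can be checked directly in its caffe.proto before building anything. A minimal sketch, assuming a POSIX shell with grep; proto_has_field is a hypothetical helper name, and a plain text search is only a heuristic, not a real protobuf parse:

```shell
# proto_has_field PROTO_FILE FIELD
# Succeeds if FIELD occurs as a whole word somewhere in PROTO_FILE.
proto_has_field() {
    grep -qw -- "$2" "$1"
}
```

For example, from the py-faster-rcnn directory: proto_has_field caffe-fast-rcnn/src/caffe/proto/caffe.proto roi_pooling_param || echo "this Caffe tree does not define roi_pooling_param".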

 

The pile of problems I ran into along the way, and what I did about each:

tju@tju-System-Product-Name:~/tju/py-faster-rcnn/caffe-fast-rcnn/tools$ gcc upgrade_net_proto_text.cpp

upgrade_net_proto_text.cpp:10:27: fatal error: caffe/caffe.hpp: No such file or directory

 #include "caffe/caffe.hpp"

 

tju@tju-System-Product-Name:~/tju/py-faster-rcnn/caffe-fast-rcnn/tools$ sudo nautilus

Move caffe.hpp from py-faster-rcnn/caffe-fast-rcnn/include/caffe to /usr/include/:

tju@tju-System-Product-Name:~/tju/py-faster-rcnn/caffe-fast-rcnn/include/caffe$ cd /usr/include/

tju@tju-System-Product-Name:/usr/include$ ls | grep caffe

caffe.hpp

The directory containing the headers that gcc needs must also be added to its include path with -I: /home/tju/tju/py-faster-rcnn/caffe-fast-rcnn/include/

 

tju@tju-System-Product-Name:~/tju/py-faster-rcnn/caffe-fast-rcnn/tools$ gcc upgrade_net_proto_text.cpp -I /home/tju/tju/py-faster-rcnn/caffe-fast-rcnn/include/ -o upgrade_net_proto_text

 

In file included from /home/tju/tju/py-faster-rcnn/caffe-fast-rcnn/include/caffe/common.hpp:19:0,

                 from /home/tju/tju/py-faster-rcnn/caffe-fast-rcnn/include/caffe/blob.hpp:8,

                 from /home/tju/tju/py-faster-rcnn/caffe-fast-rcnn/include/caffe/caffe.hpp:7,

                 from upgrade_net_proto_text.cpp:10:

/home/tju/tju/py-faster-rcnn/caffe-fast-rcnn/include/caffe/util/device_alternate.hpp:34:23: fatal error: cublas_v2.h: No such file or directory

 #include <cublas_v2.h>

                       ^

compilation terminated.

tju@tju-System-Product-Name:~/tju/py-faster-rcnn/caffe-fast-rcnn/tools$

 

tju@tju-System-Product-Name:~$ cd /usr/local/cuda-7.5/targets/x86_64-linux/include/

tju@tju-System-Product-Name:/usr/local/cuda-7.5/targets/x86_64-linux/include$ ls | grep cublas.h

cublas.h

tju@tju-System-Product-Name:/usr/local/cuda-7.5/targets/x86_64-linux/include$ pwd

/usr/local/cuda-7.5/targets/x86_64-linux/include

 

tju@tju-System-Product-Name:~/tju/py-faster-rcnn/caffe-fast-rcnn/tools$ gcc upgrade_net_proto_text.cpp -I /home/tju/tju/py-faster-rcnn/caffe-fast-rcnn/include/ -I /usr/local/cuda-7.5/targets/x86_64-linux/include -o upgrade_net_proto_text

 

In file included from /home/tju/tju/py-faster-rcnn/caffe-fast-rcnn/include/caffe/caffe.hpp:7:0,

                 from upgrade_net_proto_text.cpp:10:

/home/tju/tju/py-faster-rcnn/caffe-fast-rcnn/include/caffe/blob.hpp:9:34: fatal error: caffe/proto/caffe.pb.h: No such file or directory

 #include "caffe/proto/caffe.pb.h"

                                  ^

compilation terminated.

 

tju@tju-System-Product-Name:~/tju/py-faster-rcnn/caffe-fast-rcnn/src/caffe/proto$ sudo protoc caffe.proto  --cpp_out=.

 

tju@tju-System-Product-Name:~/tju/py-faster-rcnn/caffe-fast-rcnn/src/caffe/proto$ ls -l

total 1916

-rw-r--r-- 1 root root 1103165 Nov 25 01:50 caffe.pb.cc

-rw-r--r-- 1 root root  794370 Nov 25 01:50 caffe.pb.h

-rw-rw-r-- 1 tju  tju    57711 Nov 24 19:02 caffe.proto

tju@tju-System-Product-Name:~/tju/py-faster-rcnn/caffe-fast-rcnn/src/caffe/proto$ pwd

/home/tju/tju/py-faster-rcnn/caffe-fast-rcnn/src/caffe/proto

 

tju@tju-System-Product-Name:~/tju/py-faster-rcnn/caffe-fast-rcnn$ mkdir include/caffe/proto

 

root@tju-System-Product-Name:/home/tju/桌面# cd py-faster-rcnn/

root@tju-System-Product-Name:/home/tju/桌面/py-faster-rcnn# cd lib/

root@tju-System-Product-Name:/home/tju/桌面/py-faster-rcnn/lib# make

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 10: ordinal not in range(128)

building 'utils.cython_bbox' extension

 

utils/bbox.c:1:2: error: #error Do not use this file, it is the result of a failed Cython compilation.

 #error Do not use this file, it is the result of a failed Cython compilation.

  ^

error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

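make: *** [all] Error 1

Cause: the path contains Chinese characters (here 桌面, the Desktop folder). Move the checkout to a path without them.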

 

src/caffe/layer_factory.cpp:

Replaced the file:

/home/tju/tju/software/caffe/src/caffe/layer_factory.cpp

 

/usr/local/cuda/include/cudnn.h

Replaced the file.

 

src/caffe/layers/cudnn_tanh_layer.cu

Replaced the file.

 

src/caffe/layers/cudnn_tanh_layer.cpp:16:45: error: ‘activ_desc_’ was not declared in this scope

   cudnn::createActivationDescriptor<Dtype>(&activ_desc_, CUDNN_ACTIVATION_TANH);

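Older official Caffe releases support cuDNN only up to a particular version and do not support the latest version, v5. To build against v5, the following files have to be modified by hand: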

 

include/caffe/layers/cudnn_tanh_layer.hpp, src/caffe/layers/cudnn_tanh_layer.cpp, src/caffe/layers/cudnn_tanh_layer.cu

 

include/caffe/layers/cudnn_sigmoid_layer.hpp, src/caffe/layers/cudnn_sigmoid_layer.cpp, src/caffe/layers/cudnn_sigmoid_layer.cu

 

include/caffe/layers/cudnn_relu_layer.hpp, src/caffe/layers/cudnn_relu_layer.cpp, src/caffe/layers/cudnn_relu_layer.cu

 

src/caffe/layers/cudnn_conv_layer.cu

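Delete the original .cu and .cpp files that had been renamed with an old_ prefix.

 

Compiling again with gcc then produced a problem I could not solve:

Check failed: registry.count(type) == 0 (1 vs. 0) Layer type Convolution already registered.

Fixing this means changing the code, and that is too much work. I kept at it until 2:30 in the morning and got nowhere; the build kept throwing all sorts of errors, so I gave up on this installation approach.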

 

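III. The simple, brute-force way to install py-faster-rcnn

The attempt above leads to this conclusion: py-faster-rcnn supports neither the latest Caffe nor the latest cuDNN. The new Caffe does support the latest cuDNN, but every py-faster-rcnn error above came from mismatched Caffe and cuDNN versions. Someone more expert may be able to fix this properly, but my forced manual upgrade did not succeed. Here is a simple, quick, brute-force installation instead. Use this combination:

CUDA 7.5

cudnn-7.0-linux-x64-v4.0

The Caffe version bundled with py-faster-rcnn (caffe-fast-rcnn)

Then follow the py-faster-rcnn installation steps from Section II, but skip the step that deletes caffe-fast-rcnn and copies in the latest Caffe.

With that, everything built and ran without trouble!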
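Before building, it can help to confirm which cuDNN major version the installed header actually declares. This is a small sketch, assuming a POSIX shell with awk and assuming the header defines CUDNN_MAJOR on a single #define line; cudnn_major is a hypothetical helper name, and the header path depends on where cuDNN was installed.

```shell
# cudnn_major HEADER
# Prints the value of the CUDNN_MAJOR macro defined in HEADER.
cudnn_major() {
    # Fields are split on whitespace, so the spacing between the macro name
    # and its value does not matter; stop at the first match.
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR" { print $3; exit }' "$1"
}
```

For the combination listed above, cudnn_major /usr/local/cuda/include/cudnn.h would be expected to print 4. If it prints 5, the build is likely to hit the cuDNN errors described in Section II.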

 

References:

http://blog.csdn.net/u012208159/article/details/47018095

http://stackoverflow.com/questions/39099783/fast-r-cnn-caffe-layerparameter-has-no-field-named-roi-pooling-param

http://blog.csdn.net/tmylzq187/article/details/51952847?locationNum=8

https://github.com/BVLC/caffe/issues/3947

http://blog.csdn.net/hpp24/article/details/52192682

http://www.oschina.net/question/565065_115133

http://www.cnblogs.com/bishopmoveon/p/4475036.html

http://blog.csdn.net/vbskj/article/details/52120475

http://blog.csdn.net/kkk584520/article/details/51163564