Scenario: PaddlePaddle reports `FatalError: Segmentation fault is detected by the operating system.`
With PaddlePaddle, infer.py runs normally on the CPU but crashes on the GPU.
# Problem description:
Environment:
paddlepaddle-gpu 2.1.0.post101
python 3.8.5
cuda 10.1
cudnn 8.0.5
C++ Traceback (most recent call last):
--------------------------------------
0 paddle::framework::SignalHandle(char const*, int)
1 paddle::platform::GetCurrentTraceBackString[abi:cxx11]()
----------------------
Error Message Summary:
----------------------
FatalError: `Segmentation fault` is detected by the operating system.
[TimeInfo: *** Aborted at 1623290314 (unix time) try "date -d @1623290314" if you are using GNU date ***]
[SignalInfo: *** SIGSEGV (@0x0) received by PID 23335 (TID 0x7f2ee0a14700) from PID 0 ***]
Segmentation fault (core dumped)
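Running infer on its own works; the error only appears once it is integrated into the project.
Cause analysis:
1. First, turn logging back on in infer.py
Open PaddleDetection/deploy/python/infer.py
Comment out config.disable_glog_info()
2. Run it again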
W0610 09:58:33.832181 23452 device_context.cc:404] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 10.1, Runtime API Version: 10.1
W0610 09:58:33.833010 23452 device_context.cc:422] device: 0, cuDNN Version: 8.0.
--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0 paddle::framework::SignalHandle(char const*, int)
1 paddle::platform::GetCurrentTraceBackString[abi:cxx11]()
----------------------
Error Message Summary:
----------------------
FatalError: `Segmentation fault` is detected by the operating system.
[TimeInfo: *** Aborted at 1623290314 (unix time) try "date -d @1623290314" if you are using GNU date ***]
[SignalInfo: *** SIGSEGV (@0x0) received by PID 23335 (TID 0x7f2ee0a14700) from PID 0 ***]
Segmentation fault (core dumped)
The cuDNN version is incompatible: the log reports cuDNN 8.0, so install 7.6.5 instead.
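The two diagnostic steps above can be sketched as a pair of shell helpers. This is only an illustration: `enable_glog` and `reported_cudnn_version` are hypothetical names, not part of PaddleDetection, and `run.log` is assumed to be a file where you saved the console output of the failing run.

```shell
#!/bin/sh
# Hypothetical helper: comment out the call that silences glog in a Python file.
# Indentation is preserved and a .bak copy of the original file is kept.
enable_glog() {
    sed -i.bak -E 's/^([[:space:]]*)(config\.disable_glog_info\(\))/\1# \2/' "$1"
}

# Hypothetical helper: print the first cuDNN version found in a saved log,
# based on lines such as "device: 0, cuDNN Version: 8.0."
reported_cudnn_version() {
    sed -n -E 's/.*cuDNN Version: ([0-9]+\.[0-9]+).*/\1/p' "$1" | head -n 1
}
```

For example, `enable_glog PaddleDetection/deploy/python/infer.py` followed by `reported_cudnn_version run.log` would print the version to compare against the one your PaddlePaddle build expects.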
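Solution:
1. Download cuDNN from the official archive
https://developer.nvidia.com/rdp/cudnn-archive
Download the three .deb packages (runtime, dev, and doc) that match your CUDA version and server OS.
2. Install
# Install in this order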
sudo dpkg -i libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb
sudo dpkg -i libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb
sudo dpkg -i libcudnn7-doc_7.6.5.32-1+cuda10.1_amd64.deb
# From the official guide: To verify that cuDNN is installed and is running properly, compile the mnistCUDNN sample located in the /usr/src/cudnn_samples_v7 directory in the debian file.
# 1. Copy the cuDNN sample to a writable path.
cp -r /usr/src/cudnn_samples_v7/ $HOME
# 2. Go to the writable path.
cd ~/cudnn_samples_v7/mnistCUDNN
# 3. Compile the mnistCUDNN sample.
sudo make clean
sudo make
# 4. Run the mnistCUDNN sample.
sudo ./mnistCUDNN
# 5. If cuDNN is properly installed and running on your Linux system, you will see a message similar to the following:
# Test passed!
# Check the cuDNN version
cat /usr/include/cudnn.h | grep CUDNN_MAJOR -A 2
Run infer.py again: it now works normally.
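The header check above works for cuDNN 7, where the version macros are defined in cudnn.h. From cuDNN 8 onward they live in cudnn_version.h, so on a machine that still has cuDNN 8 the grep prints nothing. A minimal sketch that handles both layouts follows; `cudnn_header_version` is a hypothetical helper name, and the default directory /usr/include is an assumption taken from the command above.

```shell
#!/bin/sh
# Hypothetical helper: print MAJOR.MINOR.PATCH from the cuDNN headers in a directory.
# Looks at cudnn_version.h first (cuDNN 8+), then falls back to cudnn.h (cuDNN 7).
cudnn_header_version() {
    dir="${1:-/usr/include}"
    for header in "$dir/cudnn_version.h" "$dir/cudnn.h"; do
        [ -f "$header" ] || continue
        version=$(awk '
            $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
            $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
            $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
            END { if (major != "") print major "." minor "." patch }
        ' "$header")
        if [ -n "$version" ]; then
            echo "$version"
            return 0
        fi
    done
    # No header with version macros was found.
    return 1
}
```

Calling `cudnn_header_version` with no argument reads /usr/include; pass another directory if your headers are installed elsewhere. The function returns a non-zero status when no version macros are found.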