RTX 3090 and PyTorch: a summary of version-matching installation problems

2023-11-15

I. Checking the CUDA and cuDNN versions on Linux

1. Check the CUDA version
Method 1: read the version file

cat /usr/local/cuda/version.txt

Method 2: use the nvcc command

nvcc --version

2. Check the cuDNN version

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2

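Note: if CUDA and cuDNN are not installed system-wide, these queries will fail. Newer CUDA toolkits ship version.json instead of version.txt, and from cuDNN 8 onward the version macros are defined in cudnn_version.h rather than cudnn.h, so adjust the paths accordingly. In practice, installing a PyTorch build for a specific CUDA version inside a virtual environment also brings its own CUDA runtime and cuDNN libraries, which is worth keeping in mind.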
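
The two lookups above can also be scripted. The following is a minimal sketch rather than a finished tool: the helper names parse_nvcc_release and parse_cudnn_version are made up for illustration, and the sample strings only imitate the usual output format.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Return the CUDA release as (major, minor) from `nvcc --version` text, or None."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from the #define lines of a cuDNN header, or None."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"#define\s+" + name + r"\s+(\d+)", header_text)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


# Illustrative inputs that only imitate the usual format (not captured from a real machine).
sample_nvcc = "Cuda compilation tools, release 11.1, V11.1.105"
sample_header = "#define CUDNN_MAJOR 8\n#define CUDNN_MINOR 0\n#define CUDNN_PATCHLEVEL 5\n"

print(parse_nvcc_release(sample_nvcc))     # (11, 1)
print(parse_cudnn_version(sample_header))  # (8, 0, 5)
```

In real use the text would come from running nvcc --version (for example through subprocess.run) and from reading the header file from disk.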

II. Matching the RTX 3090 with a PyTorch version

1. To decide whether the installed torch build works on an RTX 3090, the usual checks are:

(1) Run python to start the interpreter, then import torch to load the package;
(2) Check torch.cuda.is_available();
(3) Check torch.zeros(1).cuda().

Only when all three succeed can you say that the installed PyTorch build can drive the GPU through CUDA.
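
The three checks can be wrapped in one function. This is a sketch under stated assumptions: check_torch_gpu is a hypothetical helper name, and the small addition after the allocation is an extra step not in the list above, included on the assumption that it forces a real kernel launch (an incompatible build may only print a warning when the tensor is copied to the GPU).

```python
def check_torch_gpu():
    """Run the checks in order and return (ok, message)."""
    try:
        import torch  # step 1: the package must be importable
    except ImportError:
        return False, "torch is not installed in this environment"

    # Step 2: PyTorch must be able to see a CUDA device.
    if not torch.cuda.is_available():
        return False, "torch.cuda.is_available() returned False"

    try:
        # Step 3: create a tensor and move it to the GPU.
        tensor = torch.zeros(1).cuda()
        # Extra step (assumption): run a tiny computation to launch a kernel.
        value = (tensor + 1).item()
    except RuntimeError as exc:
        return False, "using the GPU failed: {}".format(exc)

    return True, "computed {} on {}".format(value, tensor.device)


ok, message = check_torch_gpu()
print(ok, message)
```
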
Symptom:

>>> torch.zeros(1).cuda()
/home/respecting/anaconda3/envs/torch1.8.1/lib/python3.7/site-packages/torch/cuda/__init__.py:104: UserWarning: 
NVIDIA GeForce RTX 3090 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA GeForce RTX 3090 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

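Analysis:
(1) The warning means the installed PyTorch build does not match the hardware: its binaries were compiled only for sm_37, sm_50, sm_60 and sm_70, so they contain no kernels for the RTX 3090's sm_86. This usually goes together with a build made for a CUDA version that is too old for the card.
(2) The CUDA version that comes with the PyTorch build has to satisfy two conditions:
1. The compute capabilities supported by the PyTorch build cover the compute capability of the GPU in this machine;
2. The CUDA version of the PyTorch build is not higher than the CUDA version this machine supports (in practice, the highest version the installed NVIDIA driver supports, as reported by nvidia-smi).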
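
Condition 1 can be checked mechanically. The sketch below uses a simplified rule (an sm_XY binary runs on a GPU with the same major version and an equal or higher minor version) and ignores PTX entries; build_supports_device and parse_sm are hypothetical helper names, and the second architecture list is made up for illustration.

```python
def parse_sm(arch):
    """Turn a string such as 'sm_86' into the pair (8, 6)."""
    number = int(arch.split("_")[1])
    return number // 10, number % 10


def build_supports_device(arch_list, capability):
    """Return True if some compiled sm_XY target can run on the given device."""
    major, minor = capability
    for arch in arch_list:
        if not arch.startswith("sm_"):
            continue  # skip PTX entries such as 'compute_80'
        sm_major, sm_minor = parse_sm(arch)
        if sm_major == major and sm_minor <= minor:
            return True
    return False


# Architecture list taken from the warning message; (8, 6) is the RTX 3090's sm_86.
old_build = ["sm_37", "sm_50", "sm_60", "sm_70"]
print(build_supports_device(old_build, (8, 6)))    # False

# Hypothetical list for a build that does include Ampere targets.
newer_build = ["sm_37", "sm_50", "sm_60", "sm_70", "sm_75", "sm_80", "sm_86"]
print(build_supports_device(newer_build, (8, 6)))  # True
```

On a live installation the two inputs would come from torch.cuda.get_arch_list() and torch.cuda.get_device_capability(0).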

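Solution:
Once the cause is known, the fix is straightforward:
1. The RTX 3090 needs at least CUDA 11.1 to be driven properly, so install a build for CUDA 11.1 or later.
So, among the builds of the PyTorch version you want, pick one whose CUDA version satisfies: 11.1 <= CUDA version of the PyTorch build <= CUDA version supported on this machine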
On this machine the highest supported CUDA version is 11.1, so the best choice here is the CUDA 11.1 build of PyTorch.
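
The selection rule is a plain range check on version numbers. A minimal sketch, assuming versions are written as dotted integers; usable_cuda_builds is a hypothetical helper and the list of available builds is made up for illustration.

```python
def parse_version(text):
    """Turn '11.1' into (11, 1) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def usable_cuda_builds(available, minimum, machine_max):
    """Keep the CUDA variants v with minimum <= v <= machine_max."""
    low = parse_version(minimum)
    high = parse_version(machine_max)
    return [v for v in available if low <= parse_version(v) <= high]


# Hypothetical list of CUDA variants offered for some PyTorch release.
available = ["10.2", "11.1", "11.3"]

# RTX 3090 needs at least 11.1; this machine supports at most 11.1.
print(usable_cuda_builds(available, "11.1", "11.1"))  # ['11.1']
```

Comparing tuples instead of strings matters once two-digit components appear, since "11.10" would otherwise sort before "11.2".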

III. Installing a matching PyTorch build

torch wheels:
https://download.pytorch.org/whl/cu111/torch/
torchvision packages:
https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud//pytorch/linux-64/

Find the torch wheel that matches your Python version:
torch-1.10.2+cu111-cp37-cp37m-linux_x86_64.whl
Find the matching torchvision package:
torchvision-0.11.1-py37_cu111.tar.bz2
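
The tags in the wheel filename say which Python and CUDA variant it targets, so they can be checked before downloading. A rough sketch, assuming the simple five-part wheel name without a build tag; parse_wheel_name and matches_running_python are hypothetical helper names.

```python
import sys


def parse_wheel_name(filename):
    """Split a wheel filename into its version and compatibility tags."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    name, version = parts[0], parts[1]
    python_tag, abi_tag, platform_tag = parts[-3:]
    # The part after '+' is the local version label, here the CUDA variant.
    release, _, local = version.partition("+")
    return {
        "name": name,
        "release": release,
        "cuda_tag": local,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


def matches_running_python(info):
    """Compare the wheel's CPython tag with the interpreter running this script."""
    expected = "cp{}{}".format(sys.version_info.major, sys.version_info.minor)
    return info["python_tag"] == expected


info = parse_wheel_name("torch-1.10.2+cu111-cp37-cp37m-linux_x86_64.whl")
print(info["release"], info["cuda_tag"], info["python_tag"], info["platform_tag"])
print("matches this interpreter:", matches_running_python(info))
```

Here cp37 means CPython 3.7 and cu111 means the CUDA 11.1 variant, which is why this wheel fits the Python 3.7 environment used in this post.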

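Install torch and torchvision in separate steps (note that the official compatibility table pairs torch 1.10.2 with torchvision 0.11.3; 0.11.1 is simply what was used here):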
pip install torch-1.10.2+cu111-cp37-cp37m-linux_x86_64.whl
conda install torchvision-0.11.1-py37_cu111.tar.bz2
pip install numpy
pip install Pillow

Verification:
Python 3.7.13 (default, Mar 29 2022, 02:18:16) 
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.zeros(1).cuda()
tensor([0.], device='cuda:0')
>>>

This output means the verification passed.

IV. Checking the PyTorch, CUDA, and cuDNN versions

Check the PyTorch version

torch.__version__ # PyTorch version

torch.version.cuda # Corresponding CUDA version

torch.backends.cudnn.version() # Corresponding cuDNN version

torch.cuda.get_device_name(0) # GPU type
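
The queries in this section can be collected into one small report. A sketch, with collect_versions as a hypothetical helper name; it returns an empty dict when torch is not installed and skips the GPU name when no CUDA device is visible.

```python
def collect_versions():
    """Gather PyTorch, CUDA, cuDNN and GPU information into a dict."""
    try:
        import torch
    except ImportError:
        return {}  # torch is not installed in this environment

    info = {
        "torch": torch.__version__,
        "cuda": torch.version.cuda,               # None for CPU-only builds
        "cudnn": torch.backends.cudnn.version(),  # None when cuDNN is unavailable
    }
    # Asking for a device name without a visible GPU raises an error, so guard it.
    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)
    return info


for key, value in collect_versions().items():
    print(key, value)
```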
