Installing the NVIDIA Graphics Driver and CUDA

2023-05-16

Original link: http://community.bwbot.org/topic/152

I found this online, but the original link …
Note that this guide installs CUDA 7.5, while the latest version is now 8.0. You can download it from the official website; be sure not to choose the deb method, which causes problems. The run file works best. If you have already installed a driver, be sure to choose not to install the driver again when installing CUDA, otherwise the system's graphics driver will break.

In this article, I will share some of my experience on installing NVIDIA driver and CUDA on Linux OS. Here I mainly use Ubuntu as example. Comments for CentOS/Fedora are also provided as much as I can.

Table of Contents

Install NVIDIA Graphics Driver via apt-get
Install NVIDIA Graphics Driver via runfile
Remove Previous Installations (Important)
Download the Driver
Install Dependencies
Create Blacklist for Nouveau Driver
Stop lightdm/gdm/kdm
Executing the Runfile
Check the Installation
Common Errors and Solutions
Additional Notes
Install CUDA
Install cuDNN
Table of contents generated with markdown-toc

Install NVIDIA Graphics Driver via apt-get

In Ubuntu systems, drivers for NVIDIA Graphics Cards are already provided in the official repository. Installation is as simple as one command.

For Ubuntu 14.04.5 LTS, the latest version is 352. To install the driver, execute sudo apt-get install nvidia-352 nvidia-modprobe, and then reboot the machine.

For Ubuntu 16.04.1 LTS, the latest version is 361. To install the driver, execute sudo apt-get install nvidia-361 nvidia-modprobe, and then reboot the machine.
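
For example, on Ubuntu 16.04 the whole process would look like the sketch below (the packaged driver version on your system may differ, so check what apt offers first):

sudo apt-get update
# Install the repository driver plus the modprobe helper, then reboot
sudo apt-get install nvidia-361 nvidia-modprobe
sudo reboot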

The nvidia-modprobe utility is used to load NVIDIA kernel modules and create NVIDIA character device files automatically every time your machine boots up.

It is recommended that new users install the driver this way because it is simple. However, it has some drawbacks:

The driver included in the official Ubuntu repository is usually not the latest (a quick way to check what is available is shown after this list).
There may be naming conflicts when other repositories (e.g. the ones from CUDA) are added to the system.
One has to reinstall the driver after the Linux kernel is updated.
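
To see which driver versions the configured repositories actually provide, a check along the following lines should work (a sketch; package naming varies between Ubuntu releases):

# List the NVIDIA driver packages known to apt
apt-cache search --names-only '^nvidia-[0-9]+$'
# On releases that ship ubuntu-drivers-common, this also suggests a driver
ubuntu-drivers devices
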
Install NVIDIA Graphics Driver via runfile

For advanced users who want the latest version of the driver, want to avoid the reinstallation issue after kernel updates (the runfile can register the module with dkms), or use a Linux distribution that does not provide NVIDIA drivers in its repositories, installing from the runfile is recommended.

Remove Previous Installations (Important)

One might have installed the driver via apt-get. So before reinstalling the driver from the runfile, previous installations must be removed. Execute the following commands carefully, one by one.

sudo apt-get purge nvidia*

# Note this might remove your cuda installation as well
sudo apt-get autoremove 

# Recommended if .deb files from NVIDIA were installed
# Change 1404 to the exact system version or use tab autocompletion
# After executing this file, /etc/apt/sources.list.d should contain no files related to nvidia or cuda
sudo dpkg -P cuda-repo-ubuntu1404
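
To confirm that nothing was left behind, a quick check such as the following can help (my addition, not part of the original steps):

# Should print nothing, or only lines whose status starts with "rc" (removed)
dpkg -l | grep -i nvidia
# The repository lists should no longer contain nvidia or cuda entries
ls /etc/apt/sources.list.d/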

Download the Driver

The latest driver for NVIDIA products can always be fetched from NVIDIA’s official website. It is not necessary to select all terms carefully. The driver provided for the same Product Series and Operating System is generally the same. For example, in order to find a driver for a GTX TITAN X graphics card, selecting GeForce 900 Series in Product Series and Linux 64-bit in Operating System is enough.

If you want to download the driver directly in a Linux shell, the script below would be useful.

cd ~
wget http://us.download.nvidia.com/XFree86/Linux-x86_64/367.57/NVIDIA-Linux-x86_64-367.57.run
Detailed installation instructions can be found on the download page via the README hyperlink in the ADDITIONAL INFORMATION tab. I have also summarized the key steps below.

Install Dependencies

The software required by the runfile is officially listed here, but that page seems stale and is not easy to follow.

For Ubuntu, installing the following dependencies is enough.
build-essential – For building the driver
gcc-multilib – For providing 32-bit support
dkms – For providing dkms support
(Optional) xorg and xorg-dev. On a workstation with a GUI this is required, but it has usually already been installed, since the machine already has a graphical display. On headless servers without a GUI, this is not a must.
In summary, execute sudo apt-get install build-essential gcc-multilib dkms to install all dependencies.
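
Put together, a typical Ubuntu workstation setup is sketched below (the xorg packages are optional and only needed on machines with a graphical display):

sudo apt-get install build-essential gcc-multilib dkms
# Optional, usually already present on desktop installations
sudo apt-get install xorg xorg-dev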

Required packages for CentOS are epel-release dkms libstdc++.i686. Execute yum install epel-release dkms libstdc++.i686.

Required packages for Fedora are dkms libstdc++.i686 kernel-devel. Execute dnf install dkms libstdc++.i686 kernel-devel.

Create Blacklist for Nouveau Driver

Create a file at /etc/modprobe.d/blacklist-nouveau.conf with the following contents:

blacklist nouveau
options nouveau modeset=0
Note: the NVIDIA runfile installer can also create this blacklist file automatically. Execute the runfile and follow the instructions when an error related to Nouveau appears.
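
If you prefer to create the file from the shell in one step, something like the following should work (a sketch):

sudo tee /etc/modprobe.d/blacklist-nouveau.conf > /dev/null <<'EOF'
blacklist nouveau
options nouveau modeset=0
EOF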

Then,

for Ubuntu 14.04 LTS, reboot the computer;
for Ubuntu 16.04 LTS, execute sudo update-initramfs -u and reboot the computer;
for CentOS/Fedora, execute sudo dracut --force and reboot the computer.
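
After the reboot, you can verify that Nouveau is no longer loaded (a quick check, not part of the original steps):

# No output means the blacklist took effect
lsmod | grep nouveau
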
Stop lightdm/gdm/kdm

After the computer is rebooted, we need to stop the desktop manager before executing the runfile to install the driver. lightdm is the default desktop manager in Ubuntu. If the GNOME or KDE desktop environment is used, the installed desktop manager will be gdm or kdm instead.

For Ubuntu 14.04 / 16.04, execute sudo service lightdm stop (or use gdm or kdm instead of lightdm).
For Ubuntu 16.04 / Fedora / CentOS, execute sudo systemctl stop lightdm (or use gdm or kdm instead of lightdm).
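
If you are not sure which display manager is in use, on Debian/Ubuntu systems the following usually reveals it (an assumption on my part, not from the original text):

# Prints the path of the configured display manager, e.g. /usr/sbin/lightdm
cat /etc/X11/default-display-manager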
Executing the Runfile

After all of the above preparation, we can finally execute the runfile. This is why, from the very beginning, I recommend that new users install the driver via apt-get.

cd ~
chmod +x NVIDIA-Linux-x86_64-367.57.run
sudo ./NVIDIA-Linux-x86_64-367.57.run --dkms -s

Note:

the --dkms option registers the kernel module with dkms, so that a kernel update will not require reinstalling the driver. This option should be turned on by default.
the -s option performs a silent installation, which is meant for batch installs. When installing on a single computer, turn this option off to see more installation information.
the --no-opengl-files option can also be added if non-NVIDIA (AMD or Intel) graphics are used for the display while the NVIDIA GPUs are used only for computing.
The installer may print a warning on a system without X.Org installed. Based on my experience it is safe to ignore it:
WARNING: nvidia-installer was forced to guess the X library path '/usr/lib' and X module path '/usr/lib/xorg/modules'; these paths were not queryable from the system. If X fails to find the NVIDIA X driver module, please install the pkg-config utility and the X.Org SDK/development package for your distribution and reinstall the driver.
Check the Installation

After a successful installation, the nvidia-smi command will report all CUDA-capable devices in the system.
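
For example, the checks below should list your GPUs (output depends on your hardware):

# Summary table with the driver version, utilization and memory usage
nvidia-smi
# One line per detected GPU
nvidia-smi -L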

Common Errors and Solutions

ERROR: Unable to load the ‘nvidia-drm’ kernel module.
One probable reason is that the system boots via UEFI and the Secure Boot option is turned on in the BIOS settings. Turn it off and the problem will be solved.
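
On many UEFI systems the Secure Boot state can also be checked from within Linux, assuming the mokutil utility is installed (this check is my addition):

mokutil --sb-state
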
Additional Notes

nvidia-smi -pm 1 enables persistent mode, which saves some time otherwise spent loading the driver. It has a significant effect on machines with more than 4 GPUs.

nvidia-smi -e 0 disables ECC on TESLA products, which provides about 1/15 more video memory. A reboot is required for this to take effect. nvidia-smi -e 1 can be used to enable ECC again.

nvidia-smi -pl can be used to increase or decrease the TDP limit of the GPU. Increasing it encourages a higher GPU Boost frequency, but is somewhat DANGEROUS and HARMFUL to the GPU. Decreasing it helps save power, which is useful for machines that do not have an adequate power supply and shut down unintentionally when all GPUs are pulled to their maximum load.

-i can be added after the above commands to target an individual GPU.

These commands can be added to /etc/rc.local to be executed at system boot.
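
As a sketch, the lines appended to /etc/rc.local might look like this (the GPU index and the 180 W power limit are examples only; choose values valid for your card):

# Enable persistence mode and cap the TDP of GPU 0
nvidia-smi -i 0 -pm 1
nvidia-smi -i 0 -pl 180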

Install CUDA

Installing CUDA from the runfile is much simpler and smoother than installing the NVIDIA driver. It only involves copying files into system directories and has nothing to do with the system kernel or online compilation. Removing CUDA is simply a matter of removing the installation directory. So I personally do not recommend adding NVIDIA's repositories and installing CUDA via apt-get or other package managers, as it does not reduce the complexity of installation or uninstallation but does increase the risk of messing up the repository configuration.

The CUDA runfile installer can be downloaded from NVIDIA's website. What you download is actually a package containing the following three components:

an NVIDIA driver installer, but usually of a stale version;
the actual CUDA installer;
the CUDA samples installer.
To extract the above three components, execute the runfile installer with the --extract option. Then, executing the second one finishes the CUDA installation. Installing the samples is also recommended, because useful tools such as deviceQuery and p2pBandwidthLatencyTest are provided.

Scripts for installing CUDA Toolkit are summarized below.

cd ~
wget http://developer.download.nvidia.com/compute/cuda/7.5/Prod/local_installers/cuda_7.5.18_linux.run
chmod +x cuda_7.5.18_linux.run
./cuda_7.5.18_linux.run --extract=$HOME
sudo ./cuda-linux64-rel-7.5.18-19867135.run

After the installation finishes, configure the runtime library.

sudo bash -c "echo /usr/local/cuda/lib64/ > /etc/ld.so.conf.d/cuda.conf"
sudo ldconfig

It is also recommended that Ubuntu users append the string /usr/local/cuda/bin to the PATH entry in the system file /etc/environment so that nvcc is included in $PATH. This will take effect after a reboot.
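
If you also installed the samples, building and running deviceQuery is a quick sanity check. The path below assumes the samples were installed to their default location under your home directory; adjust it if you chose differently:

cd ~/NVIDIA_CUDA-7.5_Samples/1_Utilities/deviceQuery
make
./deviceQuery   # should end with "Result = PASS"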

Install cuDNN

The recommended way to install cuDNN is to copy the tgz file to /usr/local, extract it there, and then remove the tgz file if desired. This method preserves the symbolic links. Finally, execute sudo ldconfig to update the shared library cache.
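
A sketch of those steps, using a hypothetical archive name (replace it with the cuDNN release you actually downloaded):

sudo cp cudnn-7.5-linux-x64-v5.0-ga.tgz /usr/local
cd /usr/local
sudo tar -xzvf cudnn-7.5-linux-x64-v5.0-ga.tgz
sudo rm cudnn-7.5-linux-x64-v5.0-ga.tgz   # optional
sudo ldconfig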
