Nvidia CUDA Development Environment Setup

Notes

  • A summary of the environment setup steps
  • Test machine configuration
    • Windows 10 21H1
    • HP ProBook 440 G8
      • i7-1165G7
      • 16 GB RAM
      • 512 GB SSD
      • Nvidia MX450 GPU

Resource Links

CUDA

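CUDA is the library that Nvidia GPUs need for high-performance computing. Different software packages may depend on different CUDA versions, so choose the CUDA and cuDNN versions that match the deep learning framework you use.

CUDA Compatibility Matrix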

CUDA Compatibility

CUDA Toolkit | Linux x86_64 Required Driver Version | Windows Minimum Required Driver Version
CUDA 11.4 Update 2 | >= 470.57.02 | >= 471.41
CUDA 11.4 Update 1 | >= 470.57.02 | >= 471.41
CUDA 11.4.0 GA | >= 470.42.01 | >= 471.11
CUDA 11.3.1 Update 1 | >= 465.19.01 | >= 465.89
CUDA 11.3.0 GA | >= 465.19.01 | >= 465.89
CUDA 11.2.2 Update 2 | >= 460.32.03 | >= 461.33
CUDA 11.2.1 Update 1 | >= 460.32.03 | >= 461.09
CUDA 11.2.0 GA | >= 460.27.03 | >= 460.82
CUDA 11.1.1 Update 1 | >= 455.32 | >= 456.81
CUDA 11.1 GA | >= 455.23 | >= 456.38
CUDA 11.0.3 Update 1 | >= 450.51.06 | >= 451.82
CUDA 11.0.2 GA | >= 450.51.05 | >= 451.48
CUDA 11.0.1 RC | >= 450.36.06 | >= 451.22
CUDA 10.2.89 | >= 440.33 | >= 441.22
CUDA 10.1 (10.1.105 general release, and updates) | >= 418.39 | >= 418.96
CUDA 10.0.130 | >= 410.48 | >= 411.31
CUDA 9.2 (9.2.148 Update 1) | >= 396.37 | >= 398.26
CUDA 9.2 (9.2.88) | >= 396.26 | >= 397.44
CUDA 9.1 (9.1.85) | >= 390.46 | >= 391.29
CUDA 9.0 (9.0.76) | >= 384.81 | >= 385.54
CUDA 8.0 (8.0.61 GA2) | >= 375.26 | >= 376.51
CUDA 8.0 (8.0.44) | >= 367.48 | >= 369.30
CUDA 7.5 (7.5.16) | >= 352.31 | >= 353.66
CUDA 7.0 (7.0.28) | >= 346.46 | >= 347.62
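
To make the table easier to apply, here is a minimal Python sketch that checks whether an installed driver meets the minimum for a given toolkit. Only a handful of rows are copied in, and the names `MIN_DRIVER`, `parse_version`, and `driver_is_sufficient` are made up for this illustration; they are not part of any Nvidia tool.

```python
# A few rows copied from the compatibility table:
# toolkit -> (Linux x86_64 minimum driver, Windows minimum driver)
MIN_DRIVER = {
    "11.4 Update 2": ("470.57.02", "471.41"),
    "11.2.0 GA": ("460.27.03", "460.82"),
    "10.2.89": ("440.33", "441.22"),
    "10.1": ("418.39", "418.96"),
    "10.0.130": ("410.48", "411.31"),
}


def parse_version(text):
    """Turn '470.57.02' into (470, 57, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def driver_is_sufficient(toolkit, installed_driver, platform="windows"):
    """Return True if installed_driver meets the table minimum for toolkit."""
    linux_min, windows_min = MIN_DRIVER[toolkit]
    required = windows_min if platform == "windows" else linux_min
    return parse_version(installed_driver) >= parse_version(required)


if __name__ == "__main__":
    # Hypothetical installed driver version, used only as sample input.
    installed = "461.09"
    for toolkit in ("10.1", "11.4 Update 2"):
        print(toolkit, driver_is_sufficient(toolkit, installed))
```

The comparison works on integer tuples rather than strings, so that a component such as `09` is not compared character by character against `33`.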

cuDNN

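Downloading cuDNN requires registration; the address is as follows:

cuDNN is Nvidia's library dedicated to accelerating deep learning computation, and it is generally considerably faster than plain CUDA. For example, Keras has both a regular LSTM and a CuDNNLSTM, and the speed difference between them can be as much as tenfold. Unless you have designed a special network structure, cuDNN should be your first choice.

Choose the CUDA and cuDNN versions that match the deep learning framework you use.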

Direct-Link Download Without Registration

Take Download cuDNN v7.6.1 (June 24, 2019), for CUDA 10.1 as an example: clicking cuDNN Library for Windows 10 redirects to the login page.

Copy that URL and modify it as shown below to get a direct download link. Remember to switch to the .cn domain so the download is fast.

Web page link: https://developer.nvidia.cn/compute/machine-learning/cudnn/secure/v7.6.1.34/prod/10.1_20190620/cudnn-10.1-windows10-x64-v7.6.1.34.zip

Direct link: https://developer.download.nvidia.cn/compute/redist/cudnn/v7.6.1/cudnn-10.1-windows10-x64-v7.6.1.34.zip
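
The rewrite between the two links can be sketched in Python. Note the assumption: the pattern is inferred from this single example only (keep the file name, shorten the version directory to three components, swap the host and path prefix), so it may not hold for other cuDNN releases. `direct_cudnn_url` is a hypothetical helper name.

```python
import re

# Assumed prefix of the direct link, taken from the one example in this post.
DIRECT_PREFIX = "https://developer.download.nvidia.cn/compute/redist/cudnn/"


def direct_cudnn_url(page_url):
    """Rewrite a login-protected cuDNN URL into the direct-download form.

    Assumption: generalizes the single v7.6.1.34 example; untested elsewhere.
    """
    # The archive file name is the last path segment and stays unchanged.
    filename = page_url.rsplit("/", 1)[-1]
    # The page URL uses a four-part version directory (v7.6.1.34);
    # the direct link in the example uses only the first three parts (v7.6.1).
    match = re.search(r"/secure/v(\d+\.\d+\.\d+)", page_url)
    if match is None:
        raise ValueError("URL does not look like a cuDNN 'secure' link")
    return f"{DIRECT_PREFIX}v{match.group(1)}/{filename}"


if __name__ == "__main__":
    page = (
        "https://developer.nvidia.cn/compute/machine-learning/cudnn/secure/"
        "v7.6.1.34/prod/10.1_20190620/cudnn-10.1-windows10-x64-v7.6.1.34.zip"
    )
    print(direct_cudnn_url(page))
```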

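cuDNN Compatibility Matrix

cuDNN Support Matrix

Hardware Requirements

Hardware Requirements

Software Requirements

Software Requirements

Installation Guide

PyTorch

Installation Guide

Installing Previous Versions

TensorFlow

Installation Guide

Windows Visual Studio Development Environment

Compatibility Matrix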

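Taking CUDA 10.1.243 as an example, Nvidia's official documentation briefly lists the supported Windows compiler versions here.

The table shows that Visual Studio 2012 through 2019 are all supported for native x86_64 builds.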

Compiler | IDE | Native x86_64 | Cross (x86_32 on x86_64)
MSVC Version 192x | Visual Studio 2019 16.x (Preview releases) | YES | NO
MSVC Version 191x | Visual Studio 2017 15.x (RTW and all updates) | YES | NO
MSVC Version 1900 | Visual Studio 2015 14.0 (RTW and updates 1, 2, and 3) | YES | NO
MSVC Version 1900 | Visual Studio Community 2015 | YES | NO
MSVC Version 1800 | Visual Studio 2013 12.0 | YES | YES
MSVC Version 1700 | Visual Studio 2012 11.0 | YES | YES
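
As a rough illustration, the table can be encoded as a lookup keyed by the compiler's version number. This sketch assumes that "192x" and "191x" mean the ranges 1920-1929 and 1910-1919; `SUPPORT` and `lookup_msvc` are hypothetical names, and the data applies only to CUDA 10.1 as listed above.

```python
# (version range, IDE, native x86_64, cross x86_32 on x86_64) for CUDA 10.1.
# Assumption: "192x" / "191x" in the table mean 1920-1929 / 1910-1919.
SUPPORT = [
    (range(1920, 1930), "Visual Studio 2019 16.x", True, False),
    (range(1910, 1920), "Visual Studio 2017 15.x", True, False),
    (range(1900, 1901), "Visual Studio 2015 14.0", True, False),
    (range(1800, 1801), "Visual Studio 2013 12.0", True, True),
    (range(1700, 1701), "Visual Studio 2012 11.0", True, True),
]


def lookup_msvc(msc_ver):
    """Return the table row for an MSVC version number, or None if unlisted."""
    for versions, ide, native, cross in SUPPORT:
        if msc_ver in versions:
            return {"ide": ide, "native_x86_64": native, "cross_x86_32": cross}
    return None


if __name__ == "__main__":
    for version in (1929, 1800, 1930):
        print(version, lookup_msvc(version))
```

A version that is not in the table, such as 1930, returns `None`, which here simply means the table for CUDA 10.1 does not list it.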

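Installation Order

It is best to install Visual Studio first and CUDA second, so that the CUDA installer automatically registers its Visual Studio extension.

Download Links

You can download the Community edition from Microsoft's Visual Studio download page. It is free, so why not.

The current version is Visual Studio Community 2022, so downloading Visual Studio Community 2019 takes a different route.

The Visual Studio 2019 Platform Targeting and Compatibility page has download buttons for all three editions; click one to download.

Or, if that is too much trouble, a direct download link is also provided here.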

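Installing Visual Studio Community

  • Find Visual Studio Installer in the Start menu and open it
  • If Visual Studio has never been installed, the Installed tab will be empty; click Available

  • The Available tab lists Enterprise, Professional, and Community; for the free edition, click Install to the right of Community

  • In the left pane, find and check Desktop development with C++

    • PS: check any other components you need
  • When you have finished selecting, click Install at the bottom right

Importing a .vsconfig Configuration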

{
  "version": "1.0",
  "components": [
    "Component.Microsoft.VisualStudio.LiveShare",
    "Microsoft.Component.MSBuild",
    "Microsoft.Component.VC.Runtime.UCRTSDK",
    "Microsoft.VisualStudio.Component.CoreEditor",
    "Microsoft.VisualStudio.Component.Debugger.JustInTime",
    "Microsoft.VisualStudio.Component.Graphics.Tools",
    "Microsoft.VisualStudio.Component.IntelliCode",
    "Microsoft.VisualStudio.Component.JavaScript.TypeScript",
    "Microsoft.VisualStudio.Component.NuGet",
    "Microsoft.VisualStudio.Component.Roslyn.Compiler",
    "Microsoft.VisualStudio.Component.Roslyn.LanguageServices",
    "Microsoft.VisualStudio.Component.TextTemplating",
    "Microsoft.VisualStudio.Component.TypeScript.4.3",
    "Microsoft.VisualStudio.Component.VC.140",
    "Microsoft.VisualStudio.Component.VC.ASAN",
    "Microsoft.VisualStudio.Component.VC.ATL",
    "Microsoft.VisualStudio.Component.VC.CMake.Project",
    "Microsoft.VisualStudio.Component.VC.CoreIde",
    "Microsoft.VisualStudio.Component.VC.DiagnosticTools",
    "Microsoft.VisualStudio.Component.VC.Redist.14.Latest",
    "Microsoft.VisualStudio.Component.VC.TestAdapterForBoostTest",
    "Microsoft.VisualStudio.Component.VC.TestAdapterForGoogleTest",
    "Microsoft.VisualStudio.Component.VC.Tools.x86.x64",
    "Microsoft.VisualStudio.Component.VC.v141.x86.x64",
    "Microsoft.VisualStudio.Component.VC.v141.x86.x64.Spectre",
    "Microsoft.VisualStudio.Component.Windows10SDK.19041",
    "Microsoft.VisualStudio.ComponentGroup.NativeDesktop.Core",
    "Microsoft.VisualStudio.ComponentGroup.WebToolsExtensions",
    "Microsoft.VisualStudio.ComponentGroup.WebToolsExtensions.CMake",
    "Microsoft.VisualStudio.Workload.CoreEditor",
    "Microsoft.VisualStudio.Workload.NativeDesktop"
  ]
}
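
Before importing a configuration like this, it can help to confirm that the file parses and lists the components you care about. The sketch below is only an illustration: `REQUIRED` is my own assumption about which entries matter for C++ and CUDA work (the C++ desktop workload and the x86/x64 build tools), and `missing_components` is a hypothetical helper.

```python
import json

# Assumption: these two entries are the ones worth checking for C++/CUDA work.
REQUIRED = {
    "Microsoft.VisualStudio.Workload.NativeDesktop",
    "Microsoft.VisualStudio.Component.VC.Tools.x86.x64",
}


def missing_components(vsconfig_text, required=REQUIRED):
    """Parse .vsconfig JSON text and return the required IDs it does not list."""
    config = json.loads(vsconfig_text)  # raises json.JSONDecodeError if invalid
    present = set(config.get("components", []))
    return sorted(set(required) - present)


if __name__ == "__main__":
    # Shortened sample with the same shape as the configuration above.
    sample = """
    {
      "version": "1.0",
      "components": [
        "Microsoft.VisualStudio.Component.CoreEditor",
        "Microsoft.VisualStudio.Workload.NativeDesktop"
      ]
    }
    """
    print(missing_components(sample))
```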

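Installing CUDA

Download the CUDA 10.1 update2 installer for Windows 10 from here, then double-click it to install.

  • If a GPU driver is already installed, you can leave the display driver unchecked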

Installing cuDNN

This example uses cudnn-v7.6.1.34 for cuda-10.1 windows 10 x64.

  • Click cudnn-v7.6.1.34 to download the zip file
  • Extract cudnn-10.1-windows10-x64-v7.6.1.34.zip
  • The extracted files are as follows
CUDNN-10.1-WINDOWS10-X64-V7.6.1.34
\---cuda
    |   NVIDIA_SLA_cuDNN_Support.txt
    |
    +---bin
    |       cudnn64_7.dll
    |
    +---include
    |       cudnn.h
    |
    \---lib
        \---x64
                cudnn.lib
  • Copy the extracted files
    • Copy the .dll files from CUDNN-10.1-WINDOWS10-X64-V7.6.1.34\cuda\bin to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin
    • Copy the .h files from CUDNN-10.1-WINDOWS10-X64-V7.6.1.34\cuda\include to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\include
    • Copy the .lib files from CUDNN-10.1-WINDOWS10-X64-V7.6.1.34\cuda\lib\x64 to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\lib\x64
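
The three copy steps could also be scripted. The following is a minimal sketch, not an official installer: `COPY_PLAN` and `install_cudnn` are names invented here, and the paths in the usage example assume the zip was extracted into the current directory.

```python
import shutil
from pathlib import Path

# (subdirectory, file pattern) pairs mirroring the three manual copy steps.
COPY_PLAN = [("bin", "*.dll"), ("include", "*.h"), ("lib/x64", "*.lib")]


def install_cudnn(cudnn_root, cuda_root):
    """Copy cuDNN files from the extracted 'cuda' folder into the CUDA folder.

    Returns the list of destination paths that were written.
    """
    cudnn_root, cuda_root = Path(cudnn_root), Path(cuda_root)
    if not cuda_root.is_dir():
        raise FileNotFoundError(f"CUDA directory not found: {cuda_root}")
    copied = []
    for subdir, pattern in COPY_PLAN:
        target_dir = cuda_root / subdir
        # Normally these already exist in a CUDA install; create them if not.
        target_dir.mkdir(parents=True, exist_ok=True)
        for source in sorted((cudnn_root / subdir).glob(pattern)):
            shutil.copy2(source, target_dir / source.name)
            copied.append(target_dir / source.name)
    return copied


if __name__ == "__main__":
    # Assumed locations; adjust to where the zip was actually extracted.
    extracted = Path("CUDNN-10.1-WINDOWS10-X64-V7.6.1.34") / "cuda"
    cuda_dir = Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1")
    for path in install_cudnn(extracted, cuda_dir):
        print("copied", path)
```

Writing under C:\Program Files normally requires an elevated (administrator) prompt, so the script would need to be run from one.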

Verifying CUDA

Running the CUDA Sample Programs

deviceQuery

Run the following command in Command Prompt, including the double quotes.

"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\extras\demo_suite\deviceQuery.exe"

Output

C:\Users\win10>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\extras\demo_suite\deviceQuery.exe"
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\extras\demo_suite\deviceQuery.exe Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce MX450"
CUDA Driver Version / Runtime Version 11.2 / 10.1
CUDA Capability Major/Minor version number: 7.5
Total amount of global memory: 2048 MBytes (2147483648 bytes)
(14) Multiprocessors, ( 64) CUDA Cores/MP: 896 CUDA Cores
GPU Max Clock rate: 1440 MHz (1.44 GHz)
Memory Clock rate: 3501 Mhz
Memory Bus Width: 64-bit
L2 Cache Size: 524288 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: zu bytes
Total amount of shared memory per block: zu bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 1024
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: zu bytes
Texture alignment: zu bytes
Concurrent copy and kernel execution: Yes with 6 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
CUDA Device Driver Mode (TCC or WDDM): WDDM (Windows Display Driver Model)
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: No
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.2, CUDA Runtime Version = 10.1, NumDevs = 1, Device0 = GeForce MX450
Result = PASS
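
If you want to check this output from a script rather than by eye, the final summary line and the `Result = PASS` marker are the useful parts. A small sketch, assuming the output format shown above (`summarize_device_query` is a hypothetical helper, and a device name containing ", " would confuse the simple split):

```python
def summarize_device_query(output):
    """Extract the key/value pairs of the summary line plus a pass flag."""
    # The summary line is the one that starts with "deviceQuery,".
    summary_line = next(
        (line for line in output.splitlines() if line.startswith("deviceQuery,")),
        "",
    )
    # Skip the leading "deviceQuery" token, then split "key = value" pairs.
    fields = dict(
        part.split(" = ", 1)
        for part in summary_line.split(", ")[1:]
        if " = " in part
    )
    fields["passed"] = "Result = PASS" in output
    return fields


if __name__ == "__main__":
    # The last two lines of the output shown above.
    sample = (
        "deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.2, "
        "CUDA Runtime Version = 10.1, NumDevs = 1, Device0 = GeForce MX450\n"
        "Result = PASS\n"
    )
    print(summarize_device_query(sample))
```

In the run above, the driver version (11.2) is newer than the runtime version (10.1), meaning the installed driver is newer than the 10.1 toolkit requires.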

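Other Programs

They are all in the directory C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\extras\demo_suite; double-click any of them to try it out.

Build Test

The samples are installed by default along with CUDA, so a sample is used directly for this test.

  • Press Win + R to open the Run dialog
  • Enter the following command, including the double quotes, to open the solution directly in Visual Studio 2019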
"C:\ProgramData\NVIDIA Corporation\CUDA Samples\v10.1\5_Simulations\nbody\nbody_vs2019.sln"
  • Press Ctrl + F5 to build and run; the nbody window appears once it starts
