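I just did a fresh install of Ubuntu 20.04 and am trying to install ROCm for GPU computing on an AMD Radeon R7 M445. I need it in order to use TensorFlow's GPU features. I followed this tutorial: Installing AMD ROCm (ROCm Documentation 1.0.0). I am not sure whether ROCm supports this GPU, because I have seen conflicting statements in several places.
The ROCm version is 3.7.0-20.
The Linux kernel version is 5.4.0-42-generic.
When I run sudo apt install rocm-dkms, the terminal output is: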
Secure Boot not enabled on this system. Done. Forcing installation of amdgpu
amdgpu.ko: Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/5.4.0-42-generic/updates/dkms/
amdttm.ko: Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/5.4.0-42-generic/updates/dkms/
amdkcl.ko: Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/5.4.0-42-generic/updates/dkms/
amd-sched.ko: Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/5.4.0-42-generic/updates/dkms/
depmod....
Backing up initrd.img-5.4.0-42-generic to /boot/initrd.img-5.4.0-42-generic.old-dkms Making new initrd.img-5.4.0-42-generic (If next boot fails, revert to initrd.img-5.4.0-42-generic.old-dkms image) update-initramfs......
DKMS: install completed.
When I run /opt/rocm/bin/rocminfo, the output is:
ROCk module is loaded
Able to open /dev/kfd read-write
=====================
HSA System Attributes
=====================
Runtime Version: 1.1
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
==========
HSA Agents
==========
*******
Agent 1
*******
Name: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz
Uuid: CPU-XX
Marketing Name: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 3500
BDFID: 0
Internal Node ID: 0
Compute Unit: 4
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 16294032(0xf8a090) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 16294032(0xf8a090) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
N/A
*** Done ***
And from /opt/rocm/opencl/bin/clinfo I get:
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 2.0 AMD-APP (3182.0)
Platform Name: AMD Accelerated Parallel Processing
Platform Vendor: Advanced Micro Devices, Inc.
Platform Extensions: cl_khr_icd cl_amd_event_callback
Platform Name: AMD Accelerated Parallel Processing
Number of devices: 0
Do you know what the problem is? I have tried many things I found online and updated everything I could think of or find (OpenCL, OpenGL). The installed GPU driver is "amdgpu".
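Answer 1
A colleague's laptop has a dGPU similar to the R7 M445 (an M255, I think), and it works very poorly. It was not even usable on 18.04. He can use it on 20.04 now, but it is not much better than the Intel iGPU.
(This applies to the open-source AMDGPU (non-PRO) driver. The AMDGPU-PRO driver works on 16.04.x, where x is not the latest point release, and it does support some OpenCL, IIRC.)
As for ROCm, you are out of luck. Only a handful of cards are actually supported, mainly Polaris (e.g. RX 580) and desktop Vega (e.g. Vega 64). Older cards (e.g. R9 270) are not supported at all. Mobile dGPUs such as your M445 are not supported either. It really is a mess.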
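To confirm that ROCm is not seeing the GPU, you can check whether rocminfo lists any agent other than the CPU. The following is a minimal Python sketch, not part of ROCm: the helper names list_hsa_agents and has_gpu_agent are hypothetical, and it assumes each agent block contains a "Name:" line followed by a "Device Type:" line, as in the output in the question, and that a detected GPU would be reported as "Device Type: GPU".

```python
def list_hsa_agents(rocminfo_text):
    """Return (name, device_type) pairs, one per agent in rocminfo output.

    Assumes each agent block has a 'Name:' line before its 'Device Type:' line.
    """
    agents = []
    name = None
    for raw in rocminfo_text.splitlines():
        line = raw.strip()
        # 'Marketing Name:' and 'Vendor Name:' do not start with 'Name:',
        # so only the agent's primary name line matches here.
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("Device Type:"):
            device_type = line.split(":", 1)[1].strip()
            agents.append((name, device_type))
            name = None
    return agents


def has_gpu_agent(rocminfo_text):
    """True if any agent reports the (assumed) device type 'GPU'."""
    return any(dev == "GPU" for _, dev in list_hsa_agents(rocminfo_text))


# Excerpt of the rocminfo output posted in the question.
SAMPLE = """\
*******
Agent 1
*******
Name: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz
Uuid: CPU-XX
Marketing Name: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz
Vendor Name: CPU
Device Type: CPU
"""

if __name__ == "__main__":
    print(list_hsa_agents(SAMPLE))
    print("GPU agent found:", has_gpu_agent(SAMPLE))
```

On a real system you could feed it the live output, for example with subprocess.run(["/opt/rocm/bin/rocminfo"], capture_output=True, text=True).stdout. With the output from the question it finds only the CPU agent, which matches clinfo reporting "Number of devices: 0".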