How much memory is each PID using on the GPU?

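How can I tell how much GPU memory each process is using, when CUDA-compiled code runs on 1 processor and accesses 1 of the 2 Tesla M2090 cards available on a 16-core node of an HPC cluster?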

  ~/code_dev/gcode/projects/gcode_nbd_csu$cat /proc/cpuinfo | grep processor
  processor                                        : 0
  processor                                        : 1
  processor                                        : 2
  processor                                        : 3
  processor                                        : 4
  processor                                        : 5
  processor                                        : 6
  processor                                        : 7
  processor                                        : 8
  processor                                        : 9
  processor                                        : 10
  processor                                        : 11
  processor                                        : 12
  processor                                        : 13
  processor                                        : 14
  processor                                        : 15

[~/code_dev/gcode/projects/gcode_nbd_csu]$lspci | grep -irn nvidia
26:02:00.0 3D controller: NVIDIA Corporation Tesla M2090 (rev a1)
91:84:00.0 3D controller: NVIDIA Corporation Tesla M2090 (rev a1)

top - 06:22:09 up 91 days, 17:26,  1 user,  load average: 4.09, 4.02, 3.90
Tasks: 389 total,   5 running, 384 sleeping,   0 stopped,   0 zombie
Cpu(s): 13.1%us,  1.4%sy,  0.0%ni, 85.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  32843408k total, 11775040k used, 21068368k free,   196700k buffers
Swap: 64002952k total,        0k used, 64002952k free,  8265528k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
15385 duser1    20   0 52.1g  31m  22m R 100.1  0.1  38:49.07 gcode_1.04
24712 otherguy  20   0 53.1g 979m 125m R 100.1  3.1  68557:48 gcode_62_0
25530 otherguy  20   0 53.1g 977m 126m R 100.1  3.0  28573:45 gcode_62_1
15985 otherguy  20   0 53.1g 980m 126m R 98.1  3.1  40051:48 gcode_62_1
    1 root      20   0 19360 1536 1224 S  0.0  0.0   0:01.71 init
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.01 kthreadd
    3 root      RT   0     0    0    0 S  0.0  0.0   0:27.18 migration/0

Answer 1

There is a tool called nvidia-smi. It ships with the NVIDIA CUDA driver and tells you which PIDs are using the GPU, along with the GPU memory each one holds:

[root@localhost release]# nvidia-smi 
Wed Sep 26 23:16:16 2012       
+------------------------------------------------------+                       
| NVIDIA-SMI 3.295.41   Driver Version: 295.41         |                       
|-------------------------------+----------------------+----------------------+
| Nb.  Name                     | Bus Id        Disp.  | Volatile ECC SB / DB |
| Fan   Temp   Power Usage /Cap | Memory Usage         | GPU Util. Compute M. |
|===============================+======================+======================|
| 0.  Tesla C2050               | 0000:05:00.0  On     |         0          0 |
|  30%   62 C  P0    N/A /  N/A |   3%   70MB / 2687MB |   44%     Default    |
|-------------------------------+----------------------+----------------------|
| Compute processes:                                               GPU Memory |
|  GPU  PID     Process name                                       Usage      |
|=============================================================================|
|  0.  7336     ./align                                                 61MB  |
+-----------------------------------------------------------------------------+
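
To collect the same per-PID figures from a script, one option is to ask nvidia-smi for machine-readable output instead of reading the table. The sketch below assumes a driver newer than the 295.41 release shown above, one that supports the `--query-compute-apps` and `--format=csv,noheader,nounits` options. The function names `parse_compute_apps` and `gpu_memory_by_pid` are illustrative, not part of any NVIDIA tool.

```python
import subprocess

# Assumption: the installed nvidia-smi supports these query options.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Parse 'pid, name, MiB' lines into (pid, name, mem_mib) tuples.

    mem_mib is None when nvidia-smi reports a non-numeric value
    such as '[N/A]'. A PID appears once per GPU it is using.
    """
    rows = []
    for line in csv_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split from both ends so a comma inside the process name survives.
        pid_str, rest = line.split(",", 1)
        name, mem_str = rest.rsplit(",", 1)
        mem_str = mem_str.strip()
        mem_mib = int(mem_str) if mem_str.isdigit() else None
        rows.append((int(pid_str), name.strip(), mem_mib))
    return rows


def gpu_memory_by_pid():
    """Run nvidia-smi and return the parsed per-process rows."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_compute_apps(result.stdout)
```

Calling `gpu_memory_by_pid()` on a node with the driver installed returns one tuple per compute process. The PIDs can then be matched against the ones listed by `top` to see which user's job holds memory on which card.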

Source: https://stackoverflow.com/a/12608682/2992519
