How to monitor memory IO on Linux

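There are many tools for monitoring disk IO, such as dstat.

Is there a tool for monitoring DRAM IO, for example one that shows how many MB of data are read from DRAM per second?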

Answer 1

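Since you are using an Intel CPU, you should be able to use Processor Counter Monitor, an Intel tool that is now open source. If I'm reading it correctly, building it on Linux requires only g++ and make.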

Before running it, make sure the msr kernel module is loaded (sudo modprobe msr) or built into the kernel.
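As a rough way to check this up front, the sketch below looks for the /dev/cpu/0/msr device node that the msr driver creates, and falls back to scanning /proc/modules for a module named msr. The function names and default paths are illustrative assumptions, not part of Processor Counter Monitor.

```python
import os


def msr_module_listed(proc_modules_text: str) -> bool:
    """Return True if a module named exactly 'msr' appears in /proc/modules-style text."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The first whitespace-separated field of each line is the module name.
        if fields and fields[0] == "msr":
            return True
    return False


def msr_available(modules_path: str = "/proc/modules",
                  device_path: str = "/dev/cpu/0/msr") -> bool:
    """Best-effort check that the msr driver is usable (hypothetical helper)."""
    # A built-in driver is not listed in /proc/modules, so check the device node first.
    if os.path.exists(device_path):
        return True
    try:
        with open(modules_path) as f:
            return msr_module_listed(f.read())
    except OSError:
        return False


if __name__ == "__main__":
    if msr_available():
        print("msr driver appears to be available")
    else:
        print("msr driver not found; try: sudo modprobe msr")
```

Checking the device node first covers the case where the driver is compiled into the kernel rather than loaded as a module.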

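With your CPU, you should be able to use the pcm-memory.x utility. I can't run it on my machine, so I don't know what its output looks like.

Even if your CPU does not support pcm-memory.x, you can still get overall memory bandwidth statistics from pcm.x. It looks like this: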

$ sudo ./pcm.x -i=1 -nc

 Processor Counter Monitor  ($Format:%ci ID=%h$)


IBRS and IBPB supported  : no
STIBP supported          : no
Spec arch caps supported : no
Number of physical cores: 4
Number of logical cores: 8
Number of online logical cores: 8
Threads (logical cores) per physical core: 2
Num sockets: 1
Physical cores per socket: 4
Core PMU (perfmon) version: 4
Number of core PMU generic (programmable) counters: 4
Width of generic (programmable) counters: 48 bits
Number of core PMU fixed counters: 3
Width of fixed counters: 48 bits
Nominal core frequency: 3600000000 Hz
Package thermal spec power: 65 Watt; Package minimum power: 0 Watt; Package maximum power: 0 Watt;
Trying to use Linux perf events...
Successfully programmed on-core PMU using Linux perf

Detected Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz "Intel(r) microarchitecture codename Kabylake" stepping 9 microcode level 0x5e

 EXEC  : instructions per nominal CPU cycle
 IPC   : instructions per CPU cycle
 FREQ  : relation to nominal CPU frequency='unhalted clock ticks'/'invariant timer ticks' (includes Intel Turbo Boost)
 AFREQ : relation to nominal CPU frequency while in active state (not in power-saving C state)='unhalted clock ticks'/'invariant timer ticks while in C0-state'  (includes Intel Turbo Boost)
 L3MISS: L3 (read) cache misses
 L2MISS: L2 (read) cache misses (including other core's L2 cache *hits*)
 L3HIT : L3 (read) cache hit ratio (0.00-1.00)
 L2HIT : L2 cache hit ratio (0.00-1.00)
 L3MPI : number of L3 (read) cache misses per instruction
 L2MPI : number of L2 (read) cache misses per instruction
 READ  : bytes read from main memory controller (in GBytes)
 WRITE : bytes written to main memory controller (in GBytes)
 IO    : bytes read/written due to IO requests to memory controller (in GBytes); this may be an over estimate due to same-cache-line partial requests
 TEMP  : Temperature reading in 1 degree Celsius relative to the TjMax temperature (thermal headroom): 0 corresponds to the max temperature
 energy: Energy in Joules


 Core (SKT) | EXEC | IPC  | FREQ  | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3MPI | L2MPI |  TEMP

---------------------------------------------------------------------------------------------------------------
 SKT    0     0.02   1.05   0.02    0.39     402 K   1770 K    0.76    0.53    0.00    0.00     67
---------------------------------------------------------------------------------------------------------------
 TOTAL  *     0.02   1.05   0.02    0.39     402 K   1770 K    0.76    0.53    0.00    0.00     N/A

 Instructions retired:  487 M ; Active cycles:  462 M ; Time (TSC): 3602 Mticks ; C0 (active,non-halted) core residency: 4.12 %

 C1 core residency: 9.26 %; C3 core residency: 0.59 %; C6 core residency: 2.14 %; C7 core residency: 83.89 %;
 C0 package residency: 36.94 %; C2 package residency: 63.06 %; C3 package residency: 0.00 %; C6 package residency: 0.00 %; C7 package residency: 0.00 %; C8 package residency: 0.00 %; C9 package residency: 0.00 %; C10 package residency: 0.00 %;
                             ┌───────────────────────────────────────────────────────────────────────────────┐
 Core    C-state distribution│0001111111667777777777777777777777777777777777777777777777777777777777777777777│
                             └───────────────────────────────────────────────────────────────────────────────┘
                             ┌────────────────────────────────────────────────────────────────────────────────┐
 Package C-state distribution│00000000000000000000000000000022222222222222222222222222222222222222222222222222│
                             └────────────────────────────────────────────────────────────────────────────────┘

 PHYSICAL CORE IPC                 : 2.11 => corresponds to 52.65 % utilization for cores in active state
 Instructions per nominal CPU cycle: 0.03 => corresponds to 0.85 % core utilization over time interval
 SMI count: 0
---------------------------------------------------------------------------------------------------------------
MEM (GB)->|  READ |  WRITE |   IO   | CPU energy |
---------------------------------------------------------------------------------------------------------------
 SKT   0     0.24     0.03     0.00       1.88
---------------------------------------------------------------------------------------------------------------
Cleaning up
 Zeroed uncore PMU registers

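Unless you specify -i=1, the output repeats periodically. If you omit -nc, you get per-core execution statistics rather than just the totals.

The memory statistics are in the MEM (GB) table at the bottom: READ and WRITE are the amounts of data read from and written to the memory controller during the sampling interval.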
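To get the MB-per-second figure asked about in the question, the MEM (GB) table can be parsed and divided by the sampling interval. In the run above, Time (TSC) is 3602 Mticks at a nominal 3.6 GHz, so the interval is about one second and 0.24 GB read corresponds to roughly 240 MB/s. The sketch below is a minimal parser for the text layout shown above. The function names are hypothetical, the layout may differ between versions, and it assumes decimal units (1 GB = 1000 MB), which you should confirm for your build.

```python
def parse_mem_table(pcm_output: str) -> dict:
    """Return {socket: {column: value}} from the first 'MEM (GB)' table in pcm.x text output."""
    result = {}
    columns = None
    for line in pcm_output.splitlines():
        stripped = line.strip()
        if columns is None:
            if stripped.startswith("MEM (GB)->"):
                # Header cells are separated by '|'; the first cell is the table label.
                columns = [c.strip() for c in stripped.split("|")[1:] if c.strip()]
            continue
        if not stripped or stripped.startswith("-"):
            continue  # separator or blank line inside the table
        if not stripped.startswith("SKT"):
            break  # end of the per-socket rows
        fields = stripped.split()  # e.g. ["SKT", "0", "0.24", "0.03", "0.00", "1.88"]
        values = [float(v) for v in fields[2:2 + len(columns)]]
        result[int(fields[1])] = dict(zip(columns, values))
    return result


def gb_to_mb_per_s(gigabytes: float, interval_s: float = 1.0, mb_per_gb: float = 1000.0) -> float:
    """Convert a per-interval GB reading to MB/s (decimal units are an assumption)."""
    return gigabytes * mb_per_gb / interval_s


if __name__ == "__main__":
    sample = (
        "MEM (GB)->|  READ |  WRITE |   IO   | CPU energy |\n"
        "-------------------------------------------------\n"
        " SKT   0     0.24     0.03     0.00       1.88\n"
        "-------------------------------------------------\n"
    )
    for socket, row in parse_mem_table(sample).items():
        print(f"socket {socket}: read {gb_to_mb_per_s(row['READ']):.0f} MB/s, "
              f"write {gb_to_mb_per_s(row['WRITE']):.0f} MB/s")
```

In practice the text would come from capturing the output of a command such as sudo ./pcm.x -i=1 -nc. The parser skips the earlier per-core table because it only starts reading rows after the MEM (GB) header.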
