
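I need to create a line chart that shows each process's CPU usage over time. How do I build a chart with time on the X axis and %CPU on the Y axis, and then use the command name to indicate which line on the chart the data belongs to?
My data was created with these unix commands:
pidstat -hdul 1 > file.txt
sed '1d;/^[#]/d;/^$/d;s/^[ ]*//;s/[ ]+/,/g' raw_data_file_input > nice_data_file.csv
My data has the following columns:
Time, PID, %usr, %system, %guest, %CPU, CPU, KB_rd/s, KB_wr/s, KB_ccwr/s, Command
In other words, I want each command to be its own line on the chart across the different times: "kjournald" would be one line and "gnome-panel" another, separate line.
Here is a sample of the data in csv format: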
1320713878,680,0.00,0.00,0.00,0.00,0,0.00,35.64,0.00,kjournald
1320713878,2831,1.98,1.98,0.00,3.96,0,0.00,0.00,0.00,/usr/bin/X,:0,-br,-verbose,-auth,/var/run/gdm/auth-for-gdm-LiEP18/database,-nolisten,tcp,vt7,
1320713878,4360,0.00,1.98,0.00,1.98,0,0.00,0.00,0.00,gnome-terminal,
1320713878,7897,1.98,0.00,0.00,1.98,0,0.00,0.00,0.00,gnome-panel,
1320713878,24834,0.00,0.99,0.00,0.99,0,0.00,0.00,0.00,networking,networking,file:///usr/local/src/opensplice/install/HDE/x86.linux2.6/etc/config/ospl.xml,
1320713878,24986,0.00,1.98,0.00,1.98,1,0.00,0.00,0.00,pidstat,-hdul,1,
1320713879,2426,1.00,1.00,0.00,2.00,3,0.00,0.00,0.00,/usr/bin/prltoolsd,-p,/var/run/prltoolsd.pid,
1320713879,2831,2.00,1.00,0.00,3.00,2,0.00,4.00,0.00,/usr/bin/X,:0,-br,-verbose,-auth,/var/run/gdm/auth-for-gdm-LiEP18/database,-nolisten,tcp,vt7,
1320713879,7904,14.00,0.00,0.00,14.00,1,0.00,0.00,0.00,nautilus,--no-desktop,--browser,
1320713879,24834,0.00,1.00,0.00,1.00,0,0.00,0.00,0.00,networking,networking,file:///usr/local/src/opensplice/install/HDE/x86.linux2.6/etc/config/ospl.xml,
1320713879,24992,0.00,2.00,0.00,2.00,0,0.00,0.00,0.00,/bin/sh,./killAll.sh,
1320713880,2831,0.00,1.00,0.00,1.00,1,0.00,0.00,0.00,/usr/bin/X,:0,-br,-verbose,-auth,/var/run/gdm/auth-for-gdm-LiEP18/database,-nolisten,tcp,vt7,
1320713880,3466,0.00,1.00,0.00,1.00,2,0.00,0.00,0.00,/usr/sbin/nscd,
1320713880,4129,0.00,2.00,0.00,2.00,0,0.00,0.00,0.00,/usr/bin/prl_wmouse_d,-d,
1320713880,24986,0.00,2.00,0.00,2.00,2,0.00,0.00,0.00,pidstat,-hdul,1,
1320713880,24992,0.00,2.00,0.00,2.00,3,0.00,0.00,0.00,/bin/sh,./killAll.sh,
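Answer 1
I don't know how familiar you are with the Text to Columns tool on Excel's Data tab, but you can use it to quickly split a comma-separated file.
I may be missing something, but it looks to me as though your sample data has only one instance of each "command". I can't build a time-based line chart from single data points, so I made up some additional "dummy" data with dummy values for each "time".
A pivot table handles this easily. It sorts the data, and you can filter it to show only certain categories (here, "command").
Once the pivot table is built, you can click anywhere in it and insert a chart.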
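For readers who want to check the pivot outside Excel, here is a minimal Python sketch of the same reshaping: one series per command, keyed by time, holding %CPU. It is an illustration, not part of the answer above. The function name pivot_cpu and the abridged sample rows are assumptions made for the example; the only things taken from the question are the column order and the fact that command arguments are also comma-separated, which is why only the first 10 commas are treated as column separators.

```python
from collections import defaultdict

# Rows abridged from the question's sample (long argument lists shortened).
SAMPLE = """\
1320713878,680,0.00,0.00,0.00,0.00,0,0.00,35.64,0.00,kjournald
1320713878,2831,1.98,1.98,0.00,3.96,0,0.00,0.00,0.00,/usr/bin/X,:0,-br,-verbose,
1320713879,2831,2.00,1.00,0.00,3.00,2,0.00,4.00,0.00,/usr/bin/X,:0,-br,-verbose,
1320713880,24986,0.00,2.00,0.00,2.00,2,0.00,0.00,0.00,pidstat,-hdul,1,
"""


def pivot_cpu(lines):
    """Return {command: {timestamp: %CPU}} from pidstat-style CSV lines."""
    table = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the first 10 commas only: field 10 is the command plus
        # its arguments, which the sed step also joined with commas.
        fields = line.split(",", 10)
        ts = int(fields[0])
        cpu = float(fields[5])  # the %CPU column
        command = fields[10].split(",")[0]  # command name without arguments
        # Sum %CPU when several PIDs share a command name at the same time.
        table[command][ts] = table[command].get(ts, 0.0) + cpu
    return dict(table)


pivot = pivot_cpu(SAMPLE.splitlines())
for command, series in sorted(pivot.items()):
    print(command, sorted(series.items()))
```

Each key of pivot corresponds to one line on the chart. Summing %CPU across PIDs that share a command name is a design choice for this sketch; keeping them apart would mean keying on (command, PID) instead.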
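Answer 2
To analyse the resource consumption of processes on a Linux system, you can use Procpath (I'm its author). The built-in SVG visualisation can't produce the chart you want (it shows PIDs rather than command names), but the chart can be built in the ad-hoc visualisation tool (i.e. Sqliteviz). You don't need any software besides Python and a browser.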
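Record
I assume (judging by the pidstat arguments) that you want to record the activity of all processes on the system and decide at a later stage what to visualise.
Install Procpath:
pip install --user Procpath
Record procfs metrics of all processes at the default 10-second interval:
procpath record -d all.sqlite
Press Ctrl + C to stop recording once you think you have collected enough data.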
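Visualise
Run Sqliteviz with the command below and drop the recorded database, all.sqlite, onto it:
procpath explore
You can use the predefined Per PID CPU visualisation (my query) as a base. Add stat_comm to both SELECT statements. Depending on which processes you want to visualise, add a WHERE clause to the inner SELECT. If you're interested in kjournald and gnome-panel, it can be:
WHERE stat_comm IN ('kjournald', 'gnome-panel')
(I'll use a couple of Cinnamon processes that I have running on my system.) The modified query looks like: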
WITH diff AS (
  SELECT
    ts,
    stat_pid,
    stat_comm,
    stat_utime + stat_stime - LAG(stat_utime + stat_stime) OVER (
      PARTITION BY stat_pid
      ORDER BY record_id
    ) tick_diff,
    ts - LAG(ts) OVER (
      PARTITION BY stat_pid
      ORDER BY record_id
    ) ts_diff
  FROM record
  WHERE stat_comm IN ('cinnamon', 'cinnamon-screen')
)
SELECT
  datetime(ts, 'unixepoch', 'localtime') ts,
  stat_pid,
  stat_comm,
  100.0 * tick_diff / (SELECT value FROM meta WHERE key = 'clock_ticks') / ts_diff cpu_load
FROM diff
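The arithmetic behind cpu_load in the query can be sketched in a few lines of Python: take the difference in CPU ticks (user plus system) between consecutive samples of one PID, convert ticks to seconds, and divide by the elapsed time. The function name, the sample values and the clock_ticks default of 100 are assumptions for illustration; the query itself reads the real value from the meta table.

```python
def cpu_load(samples, clock_ticks=100):
    """Percent CPU between consecutive samples of a single PID.

    samples: list of (ts, ticks) ordered by time, where ticks stands for
    stat_utime + stat_stime. clock_ticks=100 is an assumed value.
    """
    result = []
    # Pair each sample with its predecessor, like LAG() in the SQL query.
    for (prev_ts, prev_ticks), (ts, ticks) in zip(samples, samples[1:]):
        tick_diff = ticks - prev_ticks
        ts_diff = ts - prev_ts
        result.append((ts, 100.0 * tick_diff / clock_ticks / ts_diff))
    return result


# Hypothetical samples taken 10 seconds apart.
samples = [(0, 1000), (10, 1050), (20, 1250)]
loads = cpu_load(samples)
print(loads)
```

With these made-up samples, 50 ticks in 10 seconds is half a CPU-second, so 5% load, and 200 ticks in the next 10 seconds is 20%. The first sample has no predecessor, so the sketch omits it, whereas the SQL query yields NULL for that row.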
Change the split field from stat_pid to stat_comm (in Transforms), run the query and open the Chart tab.
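Optionally, you can tweak the chart in the UI (e.g. set a title) or the SQL query. For example, to smooth the lines with a 10-record moving average, replace the cpu_load expression with the following window function expression:
AVG(
  100.0 * tick_diff / (SELECT value FROM meta WHERE key = 'clock_ticks') / ts_diff
) OVER (
  PARTITION BY stat_pid
  ORDER BY ts
  ROWS BETWEEN 9 PRECEDING AND CURRENT ROW
) cpu_load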
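To see what the ROWS BETWEEN 9 PRECEDING AND CURRENT ROW frame does, here is a small Python sketch of a trailing moving average. It is only an illustration of the windowing idea: the function name and the input values are made up, and unlike SQL's AVG it does not handle NULL values.

```python
def moving_average(values, window=10):
    """Trailing average over the current value and up to window - 1 before it."""
    out = []
    for i in range(len(values)):
        # Early rows have fewer predecessors, so the frame is simply shorter.
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Hypothetical cpu_load values, smoothed with a window of 2 for brevity.
smoothed = moving_average([1, 2, 3, 4], window=2)
print(smoothed)
```

A larger window gives smoother lines at the cost of reacting more slowly to real spikes in CPU usage.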