Execution time and resource usage reported after a long-running process

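This may be a basic question, but I'm having trouble finding the answer. I'm using RHEL 6. After running any process that takes a lot of CPU time, I get a line that automatically reports (I guess) the run time and IO usage. It reads like: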

77410.101u 124.968s 1:42:43.49 657.9%    0+0k 0+1353384io 0pf+0w

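I have the following questions:

  1. How do I interpret each field in this message? I can guess some of them (times, IO usage, maybe CPU usage), but I'm not sure.
  2. What actually prints this line? Is it the shell? The terminal emulator? Is there a daemon running that takes care of it? What is this feature/service/whatever called?
  3. Can this message be controlled? For example, can I set the CPU usage threshold at which it is printed?
  4. Can I add extra information, such as the absolute path of the process, peak memory usage, peak disk usage, and so on?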

Answer 1

Tip for debugging what's happening

I'd suggest turning on your shell's debugging facility, assuming you're using Bash.

$ set -x

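This will show you what is happening behind the scenes when you run a command that produces this output.

Output

That output comes from a time command being applied to each command you run. Judging by its format, I'm guessing you're using the C shell (csh) or the TENEX C shell (tcsh), whose time builtin prints lines like this, rather than /usr/bin/time.

Example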

$ tcsh
$ time sleep 2
0.000u 0.000s 0:02.00 0.0%  0+0k 0+0io 0pf+0w

The reason I suspect a shell such as tcsh is that when I run /usr/bin/time from a Bash shell, the output looks like this:

$ /usr/bin/time sleep 2
0.00user 0.00system 0:02.02elapsed 0%CPU (0avgtext+0avgdata 580maxresident)k
0inputs+0outputs (0major+180minor)pagefaults 0swaps

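The output can be controlled with the -f or --format switch, so the output you're seeing could also be produced in Bash, but it would have to be done deliberately.

Meaning of the output

If you run /usr/bin/time in verbose mode (-v), you get the details of every field, like so: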

$ /usr/bin/time -v sleep 2
    Command being timed: "sleep 2"
    User time (seconds): 0.00
    System time (seconds): 0.00
    Percent of CPU this job got: 0%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:02.00
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 584
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 184
    Voluntary context switches: 2
    Involuntary context switches: 4
    Swaps: 0
    File system inputs: 0
    File system outputs: 0
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

If you line up your original output:

77410.101u 124.968s 1:42:43.49 657.9%    0+0k 0+1353384io 0pf+0w
^^^^^^^^^^ ^^^^^^^^ ^^^^^^^^^^ ^^^^^^    ^^^^ ^^^^^^^^^^^ ^^^^^^
     1        2         3       4         5        6        7
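  1. User time (seconds)
  2. System time (seconds)
  3. Elapsed (wall clock) time (h:mm:ss or m:ss)
  4. Percent of CPU this job got
  5. Average shared text size (kbytes) + average unshared data size (kbytes)
  6. Number of file system inputs by the process + number of file system outputs by the process
  7. Number of major page faults that occurred while the process was running (faults where the page has to be read in from disk) + number of times the process was swapped out of main memory

You can produce the same format by hand like so: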

$ /usr/bin/time -f '%Uu %Ss %E %P %X+%Dk %I+%Oio %Fpf+%Ww' sleep 2
0.00u 0.00s 0:02.00 0% 0+0k 0+0io 0pf+0w
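
Field 4 of such a line can be rechecked from fields 1 to 3, since the percentage is (user + system) / elapsed. The following is a minimal sketch, not part of tcsh or time: the helper name cpu_percent is hypothetical, and the sample line is the one tcsh printed in the set time=5 test in this answer.

```shell
#!/bin/sh
# Hypothetical helper: recompute the CPU percentage from a tcsh-style time line.
# Fields used: $1 = user seconds ("5.650u"), $2 = system seconds ("1.471s"),
# $3 = elapsed wall-clock time written as [h:]m:ss.
cpu_percent() {
    printf '%s\n' "$1" | awk '{
        n = split($3, t, ":")          # split the elapsed time on ":"
        secs = 0
        for (i = 1; i <= n; i++)       # fold h:m:s into plain seconds
            secs = secs * 60 + t[i]
        # Adding 0 forces numeric conversion and drops the trailing u or s.
        printf "%.1f\n", 100 * (($1 + 0) + ($2 + 0)) / secs
    }'
}

cpu_percent '5.650u 1.471s 0:09.68 73.5% 0+0k 0+0io 0pf+0w'
```

This prints 73.6, close to the 73.5% that tcsh reported; the small difference is presumably down to rounding of the displayed values. A result above 100% means the job kept more than one CPU busy on average.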

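Customizing the output

Once you've determined where /usr/bin/time is being invoked, you can customize the output by taking a peek at the man page for time. Many options can be included in this output.

$ man time

Excerpt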

   Time
   %E     Elapsed real time (in [hours:]minutes:seconds).
   %e     (Not in tcsh.) Elapsed real time (in seconds).
   %S     Total number of CPU-seconds that the process spent in kernel mode.
   %U     Total number of CPU-seconds that the process spent in user mode.
   %P     Percentage of the CPU that this job got, computed as (%U + %S) / %E.

   Memory
   %M     Maximum resident set size of the process during its lifetime, in Kbytes.
   %t     (Not in tcsh.) Average resident set size of the process, in Kbytes.
   %K     Average total (data+stack+text) memory use of the process, in Kbytes.
   %D     Average size of the process's unshared data area, in Kbytes.
   %p     (Not in tcsh.) Average size of the process's unshared stack space, in Kbytes.
   %X     Average size of the process's shared text space, in Kbytes.
   %Z     (Not in tcsh.) System's page size, in bytes.  This is a per-system constant, but varies between systems.
   %F     Number  of major page faults that occurred while the process was running.  These are faults where the page has to be read
          in from disk.
   %R     Number of minor, or recoverable, page faults.  These are faults for pages that are not valid but which have not yet  been
          claimed by other virtual pages.  Thus the data in the page is still valid but the system tables must be updated.
   %W     Number of times the process was swapped out of main memory.
   %c     Number of times the process was context-switched involuntarily (because the time slice expired).
   %w     Number of waits: times that the program was context-switched voluntarily, for instance while waiting for an I/O operation
          to complete.

   I/O
   %I     Number of file system inputs by the process.
   %O     Number of file system outputs by the process.
   %r     Number of socket messages received by the process.
   %s     Number of socket messages sent by the process.
   %k     Number of signals delivered to the process.
   %C     (Not in tcsh.) Name and command-line arguments of the command being timed.
   %x     (Not in tcsh.) Exit status of the command.

See the man page for more details.
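
For Bash users who call GNU time directly, the specifiers in the excerpt can be combined into a custom report. The sketch below wraps that in a hypothetical helper named timed; it assumes GNU time is installed at /usr/bin/time, and the key=value labels are arbitrary. Note that %C reports the command as it was typed rather than its absolute path, and the excerpt lists no specifier for peak disk usage.

```shell
#!/bin/sh
# Hypothetical wrapper (the name "timed" is illustrative): run a command under
# GNU time with a custom report. Assumes GNU time lives at /usr/bin/time.
timed() {
    # %C command line, %x exit status, %E elapsed time,
    # %U/%S user/system CPU seconds, %M peak resident set size in Kbytes
    /usr/bin/time -f 'cmd=%C exit=%x elapsed=%E user=%U sys=%S maxrss_kb=%M' "$@"
}

# Usage: the report is written to stderr after the command finishes, e.g.
#   timed sleep 2
```

Because the report goes to stderr, it does not mix with the timed command's standard output when that output is piped or redirected.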

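Edit #1: Your question

It turns out that the automatic display you asked about is caused by the following shell variable being set in csh/tcsh.

From the tcsh man page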

   The time shell variable can be set to execute the time builtin command 
     after the completion of any process that takes more  than a given number 
     of CPU seconds.

Example

Set time to 5 seconds.

$ set time=5

Confirm:

$ set|grep time
time    5

Test it:

$ bash -c "while [ 1 ];do echo hi; done"
hi
hi
...
...waited ~5 seconds, then Ctrl-C to stop it

5.650u 1.471s 0:09.68 73.5% 0+0k 0+0io 0pf+0w

The output is displayed only when the task you ran consumed more CPU time than the number of seconds set in the $time variable.
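
As far as I can tell from the tcsh man page, the time variable may also carry a second word, which is then used as the format string for the automatic report. That would cover both the threshold and the extra fields you asked about. A minimal sketch for ~/.tcshrc, assuming tcsh (plain csh may not accept the format word); the 10-second threshold and the layout are arbitrary choices:

```tcsh
# ~/.tcshrc fragment (tcsh syntax, not sh/bash)
# Word 1: report only commands that use more than 10 CPU seconds.
# Word 2: format string; %M is the maximum memory in use, in kilobytes.
set time = ( 10 "%Uu %Ss %E %P max=%MK %I+%Oio %Fpf+%Ww" )
```

The excerpt above marks %C as "Not in tcsh", so the command name or path cannot be added this way; that part would need /usr/bin/time or another tool.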
