Performance differences between pipelines and process substitution

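In most cases I prefer pipelines to process substitution in my bash scripts, especially when chaining several groups of commands, because ... | ... | ... seems more readable than ... < <(... < <(...)).

I would like to know why process substitution is so much faster than a pipeline in some cases.

To test this, I used time on two scripts that each run the same commands for 10000 iterations, one using a pipeline and the other using process substitution.

Scripts: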

pipeline.bash:

for i in {1..10000}; do
    echo foo bar |
    while read; do
        echo $REPLY >/dev/null
    done
done

proc-sub.bash:

for i in {1..10000}; do
    while read; do
        echo $REPLY >/dev/null
    done < <(echo foo bar)
done

Results:

~$ time ./pipeline.bash

real    0m17.678s
user    0m14.666s
sys     0m14.807s

~$ time ./proc-sub.bash

real    0m8.479s
user    0m4.649s
sys     0m6.358s
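
For anyone who wants to reproduce the comparison from a single file, here is a minimal sketch. The names iterations, with_pipe and with_procsub are made up for illustration, read -r and the quotes around $REPLY are additions that are not in the scripts above, and the default of 1000 iterations is only there to keep the run short. Absolute timings will differ from machine to machine.

```bash
#!/usr/bin/env bash
# Sketch: time both loop styles from a single script.
# The names iterations, with_pipe and with_procsub are illustrative only.

iterations=${1:-1000}   # iteration count; defaults to 1000 to keep runs short

with_pipe() {
    local i
    for ((i = 0; i < iterations; i++)); do
        # Both sides of the pipe run in forked subshells.
        echo foo bar |
        while read -r; do
            echo "$REPLY" >/dev/null
        done
    done
}

with_procsub() {
    local i
    for ((i = 0; i < iterations; i++)); do
        # Only the <(...) command is forked; the loop stays in this shell.
        while read -r; do
            echo "$REPLY" >/dev/null
        done < <(echo foo bar)
    done
}

# The time keyword reports to stderr using TIMEFORMAT (%R = elapsed seconds).
TIMEFORMAT='pipeline:             %R s real'
time with_pipe
TIMEFORMAT='process substitution: %R s real'
time with_procsub
```

Passing 10000 as the first argument runs the same iteration count as the scripts above.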

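I know that a pipeline creates subprocesses, while process substitution creates a named pipe or some file in /dev/fd, but it is not clear to me how these differences affect performance.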

Answer 1

Run the same scripts under strace and you can see the difference:

pipe

$ strace -c ./pipe.sh 
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 57.89    0.103005           5     20000           clone
 40.81    0.072616           2     30000     10000 wait4
  0.58    0.001037           0    120008           rt_sigprocmask
  0.40    0.000711           0     10000           pipe

proc-sub

$ strace -c ./procsub.sh 
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 85.08    0.045502           5     10000           clone
  3.25    0.001736           0     90329       322 read
  2.12    0.001133           0     20009           open
  2.03    0.001086           0     50001           dup2

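From the statistics above, you can see that the pipe version creates more child processes (20000 clone system calls versus 10000) and spends a lot of time waiting for the child processes to finish (wait4 system calls) before the parent can continue.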
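
The extra clone per iteration comes from the while loop itself being run in a subshell when it sits on the right-hand side of a pipe. One way to observe this, as a minimal sketch that assumes the lastpipe shell option is off (the bash default): a variable changed inside the piped loop is lost, while the same change survives when the loop is fed by process substitution. The variable names are illustrative.

```bash
#!/usr/bin/env bash
# Sketch: show that a piped while loop runs in a subshell,
# assuming the lastpipe option is not enabled (bash default).

count=0
echo foo bar | while read -r; do
    count=$((count + 1))     # increments a copy inside the forked subshell
done
pipe_count=$count            # the parent shell never saw the increment

count=0
while read -r; do
    count=$((count + 1))     # runs in the current shell
done < <(echo foo bar)
procsub_count=$count         # the increment is visible here

echo "pipe: $pipe_count, procsub: $procsub_count"
# prints "pipe: 0, procsub: 1"
```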

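Process substitution does not do this: the parent can read directly from the child process. Process substitution is performed at the same time as parameter and variable expansion, and the command inside the process substitution runs in the background. From the bash manpage: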

Process Substitution
       Process  substitution  is supported on systems that support named pipes
       (FIFOs) or the /dev/fd method of naming open files.  It takes the  form
       of  <(list) or >(list).  The process list is run with its input or out‐
       put connected to a FIFO or some file in /dev/fd.  The name of this file
       is  passed  as  an argument to the current command as the result of the
       expansion.  If the >(list) form is used, writing to the file will  pro‐
       vide  input  for list.  If the <(list) form is used, the file passed as
       an argument should be read to obtain the output of list.

       When available, process substitution is performed  simultaneously  with
       parameter  and variable expansion, command substitution, and arithmetic
       expansion.
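
To make the manpage text concrete, here is a small sketch of what <(list) expands to. The variable names are illustrative, and the exact path depends on the system: on Linux it is usually something like /dev/fd/63, while systems that fall back to FIFOs produce a temporary named pipe path instead.

```bash
#!/usr/bin/env bash
# Sketch: <(list) expands to a file name that the current command can open.

# echo receives the expanded file name as an ordinary argument.
fd_path=$(echo <(true))
echo "expanded to: $fd_path"

# Reading that file yields the output of the command list.
content=$(cat <(echo foo bar))
echo "read back: $content"
```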

Update

Running strace so that the statistics also include the child processes (-f):

pipe

$ strace -fqc ./pipe.sh 
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 70.76    0.215739           7     30000     10000 wait4
 28.04    0.085490           4     20000           clone
  0.78    0.002374           0    220008           rt_sigprocmask
  0.17    0.000516           0    110009     20000 close
  0.15    0.000456           0     10000           pipe

proc-sub

$ strace -fqc ./procsub.sh 
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 52.38    0.033977           3     10000           clone
 32.24    0.020913           0     96070      6063 read
  5.24    0.003398           0     20009           open
  2.34    0.001521           0    110003     10001 fcntl
  1.87    0.001210           0    100009           close
