Unable to store the output of an awk command

Question 1

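As the Bash manual says:

Each command in a pipeline is executed as a separate process (i.e., in a subshell).

Therefore the output saved in the variable total is lost when the subshell exits. You can see this by running the following command, where total is only usable inside the braces: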

git clone --progress https://somerepo |& tee gitclone.file \
| tr \\r \\n | { total="$(awk '/Receiving objects/{print $3}')" ; \
 echo "$total" ; }
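To see the effect without cloning anything, here is a minimal sketch that feeds a made-up progress line (not real git output) into the same kind of brace group. It assumes bash with default options (or any shell that runs the last pipeline stage in a subshell).

```shell
# Made-up sample line standing in for git's progress output.
unset total
printf 'Receiving objects: 100%% (5/5), done.\n' |
  { total=$(awk '/Receiving objects/{print $3}'); echo "inside the pipeline: $total"; }

# Back in the parent shell the assignment is gone.
echo "after the pipeline: ${total:-<unset>}"
```

The first echo runs in the same subshell as the assignment, so it sees the value; the second runs in the parent shell and does not.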

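Since the variable total is lost once the above command line (i.e., the pipeline) finishes, you should put the whole line inside a "command substitution", like this: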

total=$(git clone --progress https://somerepo |& tee gitclone.file | tr \\r \\n | awk '/Receiving objects/{print $3}')
echo "$total"
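A rough way to try the command substitution offline is sketched below. fake_clone is a hypothetical stand-in for git clone --progress, and the numbers in it are invented; it only mimics progress lines written to stderr and separated by \r.

```shell
# Hypothetical stand-in for `git clone --progress`: writes progress to stderr,
# overwriting the same line with \r as git does. All numbers are made up.
fake_clone() {
  printf 'Cloning into repo...\n' >&2
  printf 'Receiving objects:   0%% (1/11038)\r' >&2
  printf 'Receiving objects:  49%% (5409/11038), 8.88 MiB | 2.84 MiB/s\r' >&2
  printf 'Receiving objects: 100%% (11038/11038), 20.10 MiB | 2.84 MiB/s, done.\n' >&2
}

# The whole pipeline runs inside $(...), so its output lands in the parent shell.
total=$(fake_clone 2>&1 | tr '\r' '\n' | awk '/Receiving objects/{print $3}')
printf '%s\n' "$total"
```

Note that total then holds one percentage per matching line, not just the last one.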

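However, if you want the pipeline (the one starting with the git command) to run in the background, you have to redirect the output of awk to a file and then read that file. For example: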

tmpfile=$(mktemp)
git ... >"$tmpfile" &
# ...
# Do other stuff...
# ...
wait # for background process to complete.
total=$(cat "$tmpfile")
rm "$tmpfile"
echo "$total"
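Filled in with a placeholder job, the same pattern might look like the sketch below. slow_clone is a hypothetical function that just sleeps and prints one made-up progress line; replace it with the real git pipeline.

```shell
# Hypothetical stand-in for the slow git pipeline.
slow_clone() {
  sleep 1
  printf 'Receiving objects: 100%% (2/2), done.\n'
}

tmpfile=$(mktemp)
# Run the pipeline in the background; awk writes its result to the temp file.
slow_clone | awk '/Receiving objects/{print $3}' >"$tmpfile" &

echo "doing other work while the job runs..."

wait                      # block until the background pipeline has finished
total=$(cat "$tmpfile")   # now the file is complete and safe to read
rm -f "$tmpfile"
echo "$total"
```

Reading the file only after wait avoids picking up a partial result.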

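Tip: to send both the standard output and the standard error of the git command to tee, you can use the |& shorthand (equivalent to 2>&1 | in bash): git clone --progress https://somerepo |& tee gitclone.file |...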

Question 2

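I think the problem lies in git's output. When it rewrites the "Receiving objects:" line, it does not terminate the line with a newline.

You can tell by looking at the output of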

GIT_FLUSH=1 git clone --progress "$repo" 2>&1 | cat -bu

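After the first "Receiving" line appears, you will see no further line numbers. Here is an example where I piped the output into od to make \r and \n visible: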

0000200                   \n                       4  \t   R   e   c   e
0000220    i   v   i   n   g       o   b   j   e   c   t   s   :        
0000240        0   %       (   1   /   1   1   0   3   8   )  \r   R   e
0000260    c   e   i   v   i   n   g       o   b   j   e   c   t   s   :
0000300                0   %       (   4   9   /   1   1   0   3   8   )
0000320    ,       8   .   8   8       M   i   B       |       2   .   8
0000340    4       M   i   B   /   s  \r

Programs that read their input line by line (such as awk) will not see these lines until git has finished.
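The effect on awk can be checked with a small sketch. The input is a made-up string shaped like the od dump above: three progress updates separated by \r and a single \n at the very end.

```shell
# Made-up progress stream: updates separated by \r, one \n at the end.
sample() {
  printf 'Receiving objects:   0%%\rReceiving objects:  50%%\rReceiving objects: 100%%\n'
}

# Without translation awk sees everything as a single record.
records_raw=$(sample | awk 'END { print NR }')

# After turning \r into \n each update becomes its own record.
records_split=$(sample | tr '\r' '\n' | awk 'END { print NR }')

echo "raw: $records_raw record(s), split: $records_split record(s)"
```

This is why the tr \\r \\n stage is needed before awk can match the individual updates.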

Question 3

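Fundamentally, you have a pipe buffering problem. The programs in the pipeline use input and/or output buffers that are too large. Fortunately, there is a way to tell each program in the pipeline to buffer only one line at a time.

This is the program you need: https://manpages.ubuntu.com/manpages/bionic/man1/unbuffer.1.html

I think it is installed by default on Ubuntu Desktop, but if not: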

sudo apt install expect

You can then include the unbuffer command in the pipeline to solve the problem:

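REPO_URL=https://something   # or git@something
# total is tracked inside awk; a shell variable set in the pipeline would be lost.
unbuffer git clone --progress "$REPO_URL" 2>&1 | \
  unbuffer -p tr \\r \\n | \
  awk '/Receiving objects/ { total = $3; print total } END { print total }'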

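This prints 0%, 1%, ... 100%, and then 100% once more, because total is the last of those values. The values are printed as the clone progresses, not all at the end or in large chunks.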
