Bash: Why does read return a nonzero exit status while reading my file?

2024-5-26

I am trying to read two files line by line in Bash and do some processing on each pair of lines. Here is my Bash script:

#!/usr/bin/env bash

die()
{
    echo "$@" >&2
    exit 1
}

extract_char()
{
    echo "$1" | sed "s/.*'\([^']*\)'.*/\1/g"
}

file1=$1 # old
file2=$2 # new
counter=0

win_count=0
lose_count=0

test ! -z "$file1" || die "Please enter 2 files."
test ! -z "$file2" || die "Please enter 2 files."

while read -r line1 && read -r line2 <&3
do
    let counter++
    index=$(expr index "$line1" "'")
    if [ $index -ne 0 ]; then
        char=$(extract_char "$line1")
        char2=$(extract_char "$line2")
        test "$char" = "$char2" || die "Chars in line1 and line2 were not the same."
    elif [ "${line1#char.}" != "$line1" ]; then
        test "${line2#char.}" != "$line2" || die "Method signature found in line1, but not line2."
        method=${line1%:}
        method=${method#char.}
    elif ! grep -q '[^[:space:]]'; then
        # benchmark times
        if [ $(date --date="$line1" +%s%N) -gt $(date --date="$line2" +%s%N) ]; then
            echo "$char $method $counter: $line1 is greater than $line2"
            let lose_count++
        else
            let win_count++
        fi
    fi
done < "$file1" 3< "$file2"

echo
echo "Lines where this made an improvement: $win_count"
echo "Lines where this made a regression: $lose_count"

It is used like this:

./compare.sh oldresults.txt newresults.txt

where oldresults.txt and newresults.txt are two files containing benchmark results. Here is a sample file:

Test results for '\u0020':

char.IsUpper:
00:00:00.1231231
00:00:00:4564564

char.IsLower:
00:00:00:3453455
00:11:22:4444444

Tests for '\u1234':

# and so on

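For some reason, read seems to return a nonzero exit status before it has finished reading the files. Here is the output I get when debugging the script (via bash --debug -x compare.sh [args]):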

+ file1=oldresults.txt
+ file2=newresults.txt
+ counter=0
+ win_count=0
+ lose_count=0
+ test '!' -z oldresults.txt
+ test '!' -z newresults.txt
+ read -r line1
+ read -r line2
+ let counter++
++ expr index 'Test results for '\''\u0020'\'':
' \'
+ index=18
+ '[' 18 -ne 0 ']'
++ extract_char 'Test results for '\''\u0020'\'':
'
++ echo 'Test results for '\''\u0020'\'':
'
++ sed 's/.*'\''\([^'\'']*\)'\''.*/\1/g'
+ char='\u0020'
++ extract_char 'Test results for '\''\u0020'\'':
'
++ echo 'Test results for '\''\u0020'\'':
'
++ sed 's/.*'\''\([^'\'']*\)'\''.*/\1/g'
+ char2='\u0020'
+ test '\u0020' = '\u0020'
+ read -r line1
+ read -r line2
+ let counter++
++ expr index $'\r' \'
+ index=0
+ '[' 0 -ne 0 ']'
+ '[' $'\r' '!=' $'\r' ']'
+ grep -q '[^[:space:]]'
+ read -r line1 # exits the loop here
+ echo

+ echo 'Lines where this made an improvement: 0'
Lines where this made an improvement: 0
+ echo 'Lines where this made a regression: 0'
Lines where this made a regression: 0

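As you can see, the script iterates over two lines: first the "Test results for..." line, from which it extracts \u0020 from between the quotes, and then a line containing only a carriage return. After that, read -r line1 mysteriously seems to fail and the loop exits.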

Why is this happening, and what can I do to fix it? Thanks.

Answer 1

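What is happening is that grep -q '[^[:space:]]' is consuming the remaining lines on standard input (grep reads standard input by default when you give it no file to search). Inside the loop, standard input is "$file1", so nothing is left for the next read: the file pointer is already at EOF. What you want is grep -q '[^[:space:]]' <<< "$line1".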
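To see the effect in isolation, here is a minimal sketch that runs the same loop twice over a small temporary file: once with a bare grep and once with the here-string fix. The file contents and the variable names (tmp, buggy, fixed, blank) are made up for illustration and are not part of the original script.

```bash
#!/usr/bin/env bash
# Hypothetical three-line sample file (not the asker's data).
tmp=$(mktemp)
printf 'one\n\nthree\n' > "$tmp"

# Buggy: grep has no file argument, so it reads the loop's stdin
# and consumes every line that read has not seen yet.
buggy=0
while read -r line; do
    buggy=$((buggy + 1))
    grep -q '[^[:space:]]' || true
done < "$tmp"

# Fixed: the here-string feeds grep only the current line,
# so the loop's stdin is left untouched.
fixed=0
blank=0
while read -r line; do
    fixed=$((fixed + 1))
    if ! grep -q '[^[:space:]]' <<< "$line"; then
        blank=$((blank + 1))
    fi
done < "$tmp"

rm -f "$tmp"
echo "buggy loop iterations: $buggy"
echo "fixed loop iterations: $fixed (blank lines: $blank)"
```

Run as written, the buggy loop should report 1 iteration and the fixed loop 3, one of them blank.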

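A simple way to avoid this kind of bug is to always use non-default file descriptors when the loop body is non-trivial. There are many ways to end up swallowing all of stdin with a single command, but I have yet to come across a program that tries to read from FD 3 or higher by default.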
