I am trying to read two files line by line in Bash and perform some operations on each line. Here is my Bash script:
#!/usr/bin/env bash
die()
{
    echo "$@" >&2
    exit 1
}
extract_char()
{
    echo "$1" | sed "s/.*'\([^']*\)'.*/\1/g"
}
file1=$1 # old
file2=$2 # new
counter=0
win_count=0
lose_count=0
test ! -z "$file1" || die "Please enter 2 files."
test ! -z "$file2" || die "Please enter 2 files."
while read -r line1 && read -r line2 <&3
do
    let counter++
    index=$(expr index "$line1" "'")
    if [ $index -ne 0 ]; then
        char=$(extract_char "$line1")
        char2=$(extract_char "$line2")
        test "$char" = "$char2" || die "Chars in line1 and line2 were not the same."
    elif [ "${line1#char.}" != "$line1" ]; then
        test "${line2#char.}" != "$line2" || die "Method signature found in line1, but not line2."
        method=${line1%:}
        method=${method#char.}
    elif ! grep -q '[^[:space:]]'; then
        # benchmark times
        if [ $(date --date="$line1" +%s%N) -gt $(date --date="$line2" +%s%N) ]; then
            echo "$char $method $counter: $line1 is greater than $line2"
            let lose_count++
        else
            let win_count++
        fi
    fi
done < "$file1" 3< "$file2"
echo
echo "Lines where this made an improvement: $win_count"
echo "Lines where this made a regression: $lose_count"
It is used like this:
./compare.sh oldresults.txt newresults.txt
where oldresults.txt and newresults.txt are two files containing benchmark results. Here is a sample file:
Test results for '\u0020':
char.IsUpper:
00:00:00.1231231
00:00:00:4564564
char.IsLower:
00:00:00:3453455
00:11:22:4444444
Tests for '\u1234':
# and so on
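For some reason, read seems to return a non-zero exit status before it has finished reading the file. Here is the output I get when debugging the script (via bash --debug -x compare.sh [args]):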
+ file1=oldresults.txt
+ file2=newresults.txt
+ counter=0
+ win_count=0
+ lose_count=0
+ test '!' -z oldresults.txt
+ test '!' -z newresults.txt
+ read -r line1
+ read -r line2
+ let counter++
++ expr index 'Test results for '\''\u0020'\'':
' \'
+ index=18
+ '[' 18 -ne 0 ']'
++ extract_char 'Test results for '\''\u0020'\'':
'
++ echo 'Test results for '\''\u0020'\'':
'
++ sed 's/.*'\''\([^'\'']*\)'\''.*/\1/g'
+ char='\u0020'
++ extract_char 'Test results for '\''\u0020'\'':
'
++ echo 'Test results for '\''\u0020'\'':
'
++ sed 's/.*'\''\([^'\'']*\)'\''.*/\1/g'
+ char2='\u0020'
+ test '\u0020' = '\u0020'
+ read -r line1
+ read -r line2
+ let counter++
++ expr index $'\r' \'
+ index=0
+ '[' 0 -ne 0 ']'
+ '[' $'\r' '!=' $'\r' ']'
+ grep -q '[^[:space:]]'
+ read -r line1 # exits the loop here
+ echo
+ echo 'Lines where this made an improvement: 0'
Lines where this made an improvement: 0
+ echo 'Lines where this made a regression: 0'
Lines where this made a regression: 0
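As you can see, the script iterates over two lines: first the "Test results for..." line, from which it extracts \u0020 between the quotes, and then a line containing only a carriage return. After that, read -r line1 mysteriously seems to fail and the loop exits.
Why does this happen, and what can I do to fix it? Thanks.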
Answer 1
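What is happening is that grep -q '[^[:space:]]' consumes the remaining lines on standard input (grep reads standard input by default when it is given no file arguments), which leaves nothing for the next read: the file pointer is already at EOF. What you want is grep -q '[^[:space:]]' <<< "$line1".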
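To make the effect concrete, here is a minimal sketch that counts how many lines a loop gets to see with and without the here-string. The sample input and the variable names (input, buggy_count, fixed_count, blank_count) are made up for illustration and are not part of the original script.

```bash
#!/usr/bin/env bash
# Hypothetical four-line input; the second line is empty.
input=$'first\n\nthird\nfourth'

# Buggy variant: grep has no file argument, so it reads the loop's stdin
# and takes lines that were meant for the next `read`.
buggy_count=0
while read -r line; do
    buggy_count=$((buggy_count + 1))
    grep -q '[^[:space:]]' || true
done <<< "$input"

# Fixed variant: grep only sees the current line via a here-string.
fixed_count=0
blank_count=0
while read -r line; do
    fixed_count=$((fixed_count + 1))
    if ! grep -q '[^[:space:]]' <<< "$line"; then
        blank_count=$((blank_count + 1))
    fi
done <<< "$input"

echo "buggy loop saw $buggy_count line(s)"
echo "fixed loop saw $fixed_count line(s), $blank_count of them blank"
```

The fixed loop should visit all four lines, while the buggy loop stops early because grep has already drained the input.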
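A simple way to avoid this kind of mistake is to always use non-default file descriptors when your loop body is non-trivial. There are many ways for a single command to end up swallowing all of stdin, but I have yet to come across a program that tries to read from FD 3 or higher by default.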
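As a rough sketch of that advice, the loop below reads both files through descriptors 3 and 4, so a command in the body that reads stdin cannot take lines from either file. The temporary files, their contents, and the counters pairs and differing are assumptions made for the demo.

```bash
#!/usr/bin/env bash
# Hypothetical sample data standing in for the two result files.
old_file=$(mktemp)
new_file=$(mktemp)
printf '%s\n' 'a' '' 'c' > "$old_file"
printf '%s\n' 'a' '' 'd' > "$new_file"

pairs=0
differing=0
# Both reads use explicit descriptors; FD 0 is not the loop's data source.
while read -r line1 <&3 && read -r line2 <&4; do
    pairs=$((pairs + 1))
    # This grep still reads stdin, but stdin is no longer either file.
    grep -q '[^[:space:]]' || true
    if [ "$line1" != "$line2" ]; then
        differing=$((differing + 1))
    fi
# < /dev/null is only here so the demo never waits on a terminal.
done 3< "$old_file" 4< "$new_file" < /dev/null

rm -f "$old_file" "$new_file"
echo "compared $pairs pair(s), $differing differing"
```

The stray grep is left in on purpose: with the files on FD 3 and 4 it has nothing of theirs to consume, so the loop still walks every pair of lines.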