In a file like this:
...
Pattern2:TheWrongBar
foo
Pattern2:TheRightBar
foo
First Pattern
foo
...
I need to find the last occurrence of Pattern2 that comes before First Pattern, which in this case is:
Pattern2:TheRightBar
My first idea was to take everything in the file before First Pattern, reverse it, and grep for the first match:
sed -e '/First Pattern/,$d' myfile | tac | grep -m1 "Pattern I need to get"
Is there a way to optimize this?
Answer 1
With awk:
awk '/Pattern2/ {line=$0; next}; /First Pattern/ {print line; exit}' file.txt
/Pattern2/ {line=$0; next}: if the line matches Pattern2, save it in the variable line and move on to the next line
/First Pattern/ {print line; exit}: if First Pattern is found, print the variable line and exit
Example:
% cat file.txt
...
Pattern2:TheWrongBar
foo
Pattern2:TheRightBar
foo
First Pattern
foo
...
% awk '/Pattern2/ {line=$0; next}; /First Pattern/ {print line; exit}' file.txt
Pattern2:TheRightBar
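One edge case the one-liner above does not cover: if First Pattern appears before any Pattern2 line, line is still unset and awk prints an empty line. Below is a minimal sketch of a guarded variant, run on the sample data from the question. The temp file, the found flag, and the result and count variables are illustrative names, not part of the original answer.

```shell
# Build the sample input from the question in a temp file (illustrative only).
tmp=$(mktemp)
printf '%s\n' 'Pattern2:TheWrongBar' 'foo' 'Pattern2:TheRightBar' 'foo' 'First Pattern' 'foo' > "$tmp"

# Remember the latest Pattern2 line; on First Pattern, print it only if one was seen.
result=$(awk '/Pattern2/ {line=$0; found=1; next}
              /First Pattern/ {if (found) print line; exit}' "$tmp")
echo "$result"

# Input with no Pattern2 before First Pattern: count the lines the guarded awk emits.
count=$(printf '%s\n' 'foo' 'First Pattern' 'Pattern2:Late' |
        awk '/Pattern2/ {line=$0; found=1; next}
             /First Pattern/ {if (found) print line; exit}' | wc -l)
echo "lines printed without a preceding Pattern2: $count"
rm -f "$tmp"
```

Whether printing nothing is preferable to printing an empty line depends on what consumes the output; the guard just makes the "no match" case easier to detect.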
Answer 2
You could run
sed '/PATTERN2/h;/PATTERN1/!d;x;/PATTERN2/!d;q' infile
How it works:
sed '/PATTERN2/h # if line matches PATTERN2 save it to hold buffer
/PATTERN1/!d # if it doesn't match PATTERN1 delete it
x # exchange buffers
/PATTERN2/!d # if current pattern space doesn't match delete it
q' infile # quit (auto-printing the current pattern space)
It quits only if at least one line matching PATTERN2 comes before some line matching PATTERN1, so with an input like
1
2
PATTERN1
PATTERN2--1st
3
PATTERN2--2nd
PATTERN1
...
it will print
PATTERN2--2nd
If you want it to quit at the first match of PATTERN1 regardless, you could run
sed -n '/PATTERN2/h;/PATTERN1/!d;x;/PATTERN2/p;q' infile
With the input above, this prints nothing (which is exactly what your solution does).
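To see the difference between the two commands side by side, here is a small sketch that runs both on the sample input shown earlier in this answer. The temp file and the variable names lenient and strict are made up for the illustration.

```shell
# Sample input from this answer, without the trailing "..." line.
tmp=$(mktemp)
printf '%s\n' 1 2 PATTERN1 PATTERN2--1st 3 PATTERN2--2nd PATTERN1 > "$tmp"

# First command: keeps reading until a PATTERN1 line is preceded by a PATTERN2 line.
lenient=$(sed '/PATTERN2/h;/PATTERN1/!d;x;/PATTERN2/!d;q' "$tmp")

# Second command: quits at the very first PATTERN1 line, whether or not
# a PATTERN2 line was saved in the hold buffer before it.
strict=$(sed -n '/PATTERN2/h;/PATTERN1/!d;x;/PATTERN2/p;q' "$tmp")

echo "lenient: [$lenient]"
echo "strict:  [$strict]"
rm -f "$tmp"
```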
Answer 3
Find the line number of "First Pattern", use head to output the lines up to and including it, pipe that through tac, and grep it.
head --lines=+"$(grep -nm1 "First Pattern" file | cut -d\: -f1)" file | tac | grep -m1 "Pattern2"
e.g.
head --lines=+6 file | tac | grep -m1 "Pattern2"
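This is more reliable than using -B 10000000 in grep. Since speed is important to the OP, I checked the run times, and it also appears to be faster than all the other current answers (on my system):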
wc -l file
25910209 file
time awk '/Pattern2/ {line=$0; next}; /First Pattern/ {print line; exit}' file
Pattern2:TheRightBar
real 0m2.881s
user 0m2.844s
sys 0m0.036s
time sed '/Pattern2/h;/First Pattern/!d;x;/Pattern2/!d;q' file
Pattern2:TheRightBar
real 0m5.218s
user 0m5.192s
sys 0m0.024s
time (grep -m1 "First Pattern" file -B 10000000 | tac | grep -m1 "Pattern2")
real 0m0.624s
user 0m0.552s
sys 0m0.124s
time (head --lines=+"$(grep -nm1 "First Pattern" file | cut -d\: -f1)" file | tac | grep -m1 "Pattern2")
Pattern2:TheRightBar
real 0m0.586s
user 0m0.528s
sys 0m0.160s
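If First Pattern does not occur in the file at all, the command substitution in the head one-liner is empty and head no longer receives a usable line count. A rough sketch of the same pipeline split into two steps with a guard, run on the sample data from the question; the temp file and the variables n and result are illustrative, and GNU grep, head, and tac are assumed as in the rest of this answer.

```shell
# Sample input from the question in a temp file (illustrative only).
tmp=$(mktemp)
printf '%s\n' 'Pattern2:TheWrongBar' 'foo' 'Pattern2:TheRightBar' 'foo' 'First Pattern' 'foo' > "$tmp"

# Line number of the first "First Pattern" match (empty if there is none).
n=$(grep -nm1 'First Pattern' "$tmp" | cut -d: -f1)

if [ -n "$n" ]; then
    # Take the first n lines, reverse them, and stop at the first Pattern2 hit.
    result=$(head -n "$n" "$tmp" | tac | grep -m1 'Pattern2')
else
    # No First Pattern in the file: nothing to report.
    result=''
fi
echo "$result"
rm -f "$tmp"
```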
Answer 4
The most efficient approach, in my case, turned out to be:
grep -m1 "First Pattern" my_file -B 10000000 | tac | grep -m1 "Pattern2"
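Obviously, the -B option can't be used in every case, but grep is so much faster than awk or sed that I went with this solution. The higher the value of the -B option, the less efficient the search becomes.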
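The caveat about -B can be seen on the sample file from the question: the value has to cover every line between the wanted Pattern2 line and First Pattern, otherwise the match is cut off. A small sketch, assuming GNU grep and tac; the temp file and the variables enough and too_small are made up for the demonstration.

```shell
# Sample input from the question in a temp file (illustrative only).
tmp=$(mktemp)
printf '%s\n' 'Pattern2:TheWrongBar' 'foo' 'Pattern2:TheRightBar' 'foo' 'First Pattern' 'foo' > "$tmp"

# -B 2 reaches back far enough to include Pattern2:TheRightBar.
enough=$(grep -m1 -B 2 'First Pattern' "$tmp" | tac | grep -m1 'Pattern2')

# -B 1 only includes the "foo" line, so the second grep finds nothing.
# grep exits non-zero when nothing matches, hence the "|| true".
too_small=$(grep -m1 -B 1 'First Pattern' "$tmp" | tac | grep -m1 'Pattern2') || true

echo "enough:    [$enough]"
echo "too_small: [$too_small]"
rm -f "$tmp"
```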