How do I loop through a file and make each line a new regular expression in an awk statement?

Answer 1

Note: there is no error checking. Also, this assumes that the input in the second file follows the described pattern exactly.

awk '
    NR == FNR { a[$0]; next }                     # file 1: remember each pattern line
    !($0 in a) { b[count++] = $0; next }          # file 2: buffer non-matching lines
    { if (count > 0) count--; getline; getline }  # match: drop previous line, skip next two
    END { for (i = 0; i < count; i++) print b[i] }
' 1 2

The input is in the files 1 and 2.

File 1:

ATGCATGC
GGGGGGTT
TTTTT
AAAA

File 2:

asdfasdf
blah2
ATGCATGC
blah3
blah4 
delte-me-too
GGGGGGTT
blah5
blah5
foo
foo-delete
AAAA
bar-delete
bar-delete
bar-ok

Output:

asdfasdf
foo
bar-ok
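
The command above compares whole lines literally with `in`. If the lines of the first file are meant to be regular expressions, as the title asks, one possible sketch is to store them in an array and test each line of the second file against every stored pattern with `~`. The file names `patterns.txt` and `data.txt`, the sample contents, and the helper function `matches` are assumptions made for illustration; the delete logic (previous line, matching line, next two lines) is the same as above.

```shell
# Hypothetical sample data in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' '^ATG' 'G+TT$' > "$tmp/patterns.txt"
printf '%s\n' keep1 drop-before ATGCATGC drop1 drop2 keep2 > "$tmp/data.txt"

result=$(awk '
    # Return 1 if the line matches any stored regex, else 0.
    function matches(line,    i) {
        for (i = 1; i <= n; i++)
            if (line ~ re[i]) return 1
        return 0
    }
    NR == FNR { if ($0 != "") re[++n] = $0; next }   # first file: one regex per line
    !matches($0) { b[count++] = $0; next }           # buffer lines that match no regex
    { if (count > 0) count--; getline; getline }     # match: drop previous line, skip next two
    END { for (i = 0; i < count; i++) print b[i] }
' "$tmp/patterns.txt" "$tmp/data.txt")

printf '%s\n' "$result"
```

With this sample data the script prints keep1 and keep2. Empty pattern lines are skipped because an empty regex would match every line.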

Answer 2

The following code is not optimal (because it has to read FileB.txt twice), but it should hopefully be faster than awk:

comm --nocheck-order -23 FileB.txt <(grep -B1 -A2 -Ff FileA.txt FileB.txt)

With a recent GNU sed you can try the e flag of the s command (to save memory), combining sed and grep:

sed 'N;h;s/.*\n//;s/.*/grep -xF "&" FileA.txt/e;/./{N;N;d};x;P;D' FileB.txt
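
Since `comm` is designed for sorted input, a different way to get the same "read FileB.txt twice" trade-off is a two-pass awk run: the first pass over FileB.txt only records which line numbers to drop, and the second pass prints the rest. This is a sketch of that swapped-in technique, not the method used above; the sample contents and the variable name `pass` are assumptions made for illustration.

```shell
# Hypothetical sample data in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' ATGCATGC AAAA > "$tmp/FileA.txt"
printf '%s\n' keep1 x1 ATGCATGC x2 x3 keep2 x4 AAAA x5 x6 keep3 > "$tmp/FileB.txt"

result=$(awk '
    FILENAME == ARGV[1] { pat[$0]; next }      # FileA.txt: literal lines to look for
    pass == 1 {                                # first pass over FileB.txt
        if ($0 in pat)                         # mark line before, match, two lines after
            for (i = FNR - 1; i <= FNR + 2; i++) drop[i]
        next
    }
    !(FNR in drop)                             # second pass: print unmarked lines
' "$tmp/FileA.txt" pass=1 "$tmp/FileB.txt" pass=2 "$tmp/FileB.txt")

printf '%s\n' "$result"
```

With this sample data the script prints keep1, keep2 and keep3. Only the pattern lines and the line numbers to drop are kept in memory, never the contents of FileB.txt.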

Answer 3

This works for your sample:

awk '
    NR==FNR {patt[$0]; next}
    $0 in patt {getline; getline; held=((getline) > 0); prev=$0; next}
    {if (held) print prev; prev=$0; held=1}   # held=0 until a line is waiting
    END {if (held) print prev}
' fileA.txt fileB.txt

You have to keep all of file A in memory, but you only need to remember one line of file B at a time.
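
The same one-line-of-lookbehind idea can also be combined with regular expressions: instead of an array of literal lines, the lines of file A can be joined into a single alternation, so that each line of file B is tested with one match. The sketch below assumes every non-empty line of file A is a valid extended regular expression; the sample contents and the variable names `re` and `held` are assumptions made for illustration.

```shell
# Hypothetical sample data in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' '^ATG' '^A+$' > "$tmp/fileA.txt"
printf '%s\n' keep1 x1 ATGCATGC x2 x3 keep2 x4 AAAA x5 x6 keep3 > "$tmp/fileB.txt"

result=$(awk '
    # File A: join the patterns into one regex, e.g. (^ATG)|(^A+$).
    NR == FNR { if ($0 != "") re = re (re == "" ? "" : "|") "(" $0 ")"; next }
    re != "" && $0 ~ re {            # match: the held previous line is dropped
        getline; getline             # skip the two lines after the match
        held = ((getline) > 0)       # hold the following line, if there is one
        prev = $0; next
    }
    { if (held) print prev; prev = $0; held = 1 }
    END { if (held) print prev }
' "$tmp/fileA.txt" "$tmp/fileB.txt")

printf '%s\n' "$result"
```

With this sample data the script prints keep1, keep2 and keep3. As in the command above, the line read right after the two skipped lines is not itself tested against the patterns, so the input is assumed to follow the described layout.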
