I have multiple text files in the folders shown below. The first file contains directory paths as strings, and the other files contain path + file name.
File 1:
# pwd
/root/test
# cat file1.txt
/abc/bce/12345/input/part3
/abc/bce/12345/input/part3/err
/abc/bce/34563/input
/abc/bce/34563/input/part1
/abc/bce/34563/input/part3/wrk
/abc/bce/11198/input/VII
/abc/bce/11198/input/VII/err
/abc/bce/11198/input/VII/part3
/abc/bce/11198/input/VII/part3/err
#
File 2:
# pwd
/root/test/test
# cat file2.txt
/abc/bce/12345/input/part3/AIR9905.txt--20210421--
/abc/bce/12345/input/part3/AIR9923.txt--20210315--
/abc/bce/12345/input/part3/err/AIR9950.txt--20200512--
#
File 3:
# pwd
/root/test/test
# cat file3.txt
/abc/bce/12345/input/part3/err/AIR1034.txt--20210110--
/abc/bce/34563/input/part1/AIR3426.txt--20200420--
/abc/bce/11198/input/VII/part3/err/V.AIR7650.txt--20170625--
#
Current output:
test/file2.txt:/abc/bce/12345/input/part3/AIR9905.txt--20210421--
test/file2.txt:/abc/bce/12345/input/part3/AIR9923.txt--20210315--
test/file2.txt:/abc/bce/12345/input/part3/err/AIR9950.txt--20200512--
test/file3.txt:/abc/bce/12345/input/part3/err/AIR1034.txt--20210110--
Expected output:
test/file2.txt:/abc/bce/12345/input/part3/AIR9905.txt--20210421--
test/file2.txt:/abc/bce/12345/input/part3/AIR9923.txt--20210315--
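I am using
grep -rHw "/abc/bce/12345/input/part3" test/
to match the lines from file1 and extract the corresponding entries from file2, file3, and so on. The problem is that when I take the first line from file1 and try to retrieve the path + file name, it pulls every similar line from file2, file3, etc. What I want are the lines in file2.txt, file3.txt, etc. that (a) contain the string "/abc/bce/12345/input/part3" plus one additional string (the file name), but (b) do not contain the longer string "/abc/bce/12345/input/part3/err".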
I don't know how to do this when each line of file1 has to be compared against multiple files in turn. I need a general solution that covers all cases. Please let me know if there is another way to do this in a shell script.
Answer 1
This is called a negative lookahead, written (?!...):
grep -Pr '/abc/bce/12345/input/part3/(?!err)' test/
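Note that (?!err) also rejects any file whose name merely begins with "err", and that -P is a GNU grep extension. If either point matters, one option is to require that nothing after the directory contains a further slash. The following is a minimal sketch using only POSIX extended regular expressions; the errata.txt line is hypothetical sample data that is not in the question, and the directory is assumed to contain no regex metacharacters (a dot in the path would act as a wildcard).

```shell
#!/bin/sh
# Two sample lines from the question, plus one hypothetical file name
# ("errata.txt") that starts with "err" but is not the err/ subdirectory.
sample=$(mktemp)
printf '%s\n' \
  /abc/bce/12345/input/part3/AIR9905.txt--20210421-- \
  /abc/bce/12345/input/part3/err/AIR9950.txt--20200512-- \
  /abc/bce/12345/input/part3/errata.txt--20210101-- > "$sample"

dir=/abc/bce/12345/input/part3

# Keep only lines where the text after "<dir>/" contains no further slash,
# i.e. the file sits directly inside that directory.
direct=$(grep -E "^$dir/[^/]+\$" "$sample")
printf '%s\n' "$direct"

rm -f "$sample"
```

With this sample, the AIR9905 and errata.txt lines are kept and the err/AIR9950 line is dropped, whereas the (?!err) pattern would drop the errata.txt line as well.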
If you want to use file1.txt as the pattern list:
grep -e err$ test/file1.txt | grep --include='file[2-3].txt' -vrFf - test/test
Answer 2
find . -type f -name 'file[2-3].txt' -exec grep -HP '/abc/bce/12345/input/part3/(?!err)' {} +
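The commands above hard-code one directory. For the general case the question asks about, a possible approach is to loop over file1.txt and, for each directory, print only the lines that name a file directly inside it. The sketch below is one way to do that under stated assumptions: lines_in_dir is a hypothetical helper name, the data files are a shortened copy of the question's samples written to a scratch directory, and paths are assumed to contain no backslashes (awk -v would interpret them as escapes). It compares literal strings with index() rather than a regex, so dots and other metacharacters in paths need no escaping.

```shell
#!/bin/sh
# lines_in_dir DIR FILE...
# Print "file:line" for every line that is DIR + "/" + a name with no
# further slash, i.e. a file directly inside DIR (not in a subdirectory).
lines_in_dir() {
  want=$1; shift
  awk -v d="$want/" '
    index($0, d) == 1 {                       # line starts with "<DIR>/"
      rest = substr($0, length(d) + 1)        # text after the directory
      if (rest != "" && index(rest, "/") == 0)
        print FILENAME ":" $0
    }
  ' "$@"
}

# Sample data taken from the question (shortened), in a scratch directory.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir test
printf '%s\n' \
  /abc/bce/12345/input/part3 \
  /abc/bce/12345/input/part3/err \
  /abc/bce/34563/input/part1 > file1.txt
printf '%s\n' \
  /abc/bce/12345/input/part3/AIR9905.txt--20210421-- \
  /abc/bce/12345/input/part3/AIR9923.txt--20210315-- \
  /abc/bce/12345/input/part3/err/AIR9950.txt--20200512-- > test/file2.txt
printf '%s\n' \
  /abc/bce/12345/input/part3/err/AIR1034.txt--20210110-- \
  /abc/bce/34563/input/part1/AIR3426.txt--20200420-- > test/file3.txt

# One group of results per directory listed in file1.txt.
while IFS= read -r dir; do
  [ -n "$dir" ] || continue                   # skip blank lines
  printf '== %s\n' "$dir"
  lines_in_dir "$dir" test/file2.txt test/file3.txt
done < file1.txt
```

For the first directory this prints only the two AIR99xx lines from test/file2.txt, which matches the expected output in the question; the err entries appear under the part3/err group instead.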