File 1: sourcefile.txt
Hello, It's the beginning of the sentence.
it is the beginpoint of my career.
The end is always far.
We can start our beginpoint anytime we want.
The time we utilise to make our life good should be more.
This text doesn't mean anything.
I am writing this to include my three points:
beginpoint
time
end
File 2: strings.txt
beginpoint
end
time
Desired output:
it is the beginpoint of my career
We can start our beginpoint anytime we want.
beginpoint
The end is always far.
end
The time we utilise to make our life good should be more.
time
I used:
grep -w -F -f strings.txt sourcefile.txt > outputfile.txt
and got this output:
it is the beginpoint of my career.
The end is always far.
We can start our beginpoint anytime we want.
The time we utilise to make our life good should be more.
beginpoint
time
end
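So these are the lines I need, but I want them grouped in the order of the search terms rather than in the order they appear in the source file.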
Answer 1
One approach is to call grep once for each line of strings.txt:
$ while IFS= read -r line; do grep -wF "$line" sourcefile.txt; done < strings.txt
it is the beginpoint of my career.
We can start our beginpoint anytime we want.
beginpoint
The end is always far.
end
The time we utilise to make our life good should be more.
time
If strings.txt is very long, this may be slow; see "Why is using a shell loop to process text considered bad practice?"
If your sed supports the e flag (a GNU sed extension):
$ sed 's/.*/grep -wF '"'&'"' sourcefile.txt/e' strings.txt
it is the beginpoint of my career.
We can start our beginpoint anytime we want.
beginpoint
The end is always far.
end
The time we utilise to make our life good should be more.
time
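To see what the e flag is actually executing, the same substitution can be run without it, so that sed only prints the generated commands. This is a minimal sketch, assuming the sample strings.txt from the question; the scratch directory and the `commands` variable are made up here so the snippet is self-contained.

```shell
# Recreate the sample strings.txt in a scratch directory (illustrative setup).
tmpdir=$(mktemp -d)
printf '%s\n' beginpoint end time > "$tmpdir/strings.txt"

# Same substitution as above, minus the trailing e: sed prints the grep
# commands that the e flag would otherwise hand to the shell.
commands=$(sed 's/.*/grep -wF '"'&'"' sourcefile.txt/' "$tmpdir/strings.txt")
printf '%s\n' "$commands"

rm -rf "$tmpdir"
```

Because each line of strings.txt is pasted into a shell command, a string that itself contains a single quote would break the quoting, so this approach is best kept to trusted input.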
Answer 2
Assuming your list of strings contains no whitespace, as in your example:
$ awk -F'[^[:alnum:]_]+' '
NR==FNR { strs[$0]; next }
{ for (str in strs) for (i=1; i<=NF; i++) if ($i==str) print str, FNR, $0 }
' file2 file1 | sort -k1,1 -k2,2n | cut -d' ' -f3-
it is the beginpoint of my career.
We can start our beginpoint anytime we want.
beginpoint
The end is always far.
end
The time we utilise to make our life good should be more.
time
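This works by printing not only each line that contains a matching string but also the matched string and the line number of the match, then sorting, then removing the decoration added in the first step. The line number preserves the relative order of the lines within each group after sorting; it would not be needed if we used GNU sort with -s (stable sort). Here it is step by step: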
$ awk -F'[^[:alnum:]_]+' 'NR==FNR{strs[$0];next} {for (str in strs) for (i=1; i<=NF; i++) if ($i==str) print str, FNR, $0}' file2 file1
beginpoint 2 it is the beginpoint of my career.
end 3 The end is always far.
beginpoint 4 We can start our beginpoint anytime we want.
time 5 The time we utilise to make our life good should be more.
beginpoint 8 beginpoint
time 9 time
end 10 end
$ awk -F'[^[:alnum:]_]+' 'NR==FNR{strs[$0];next} {for (str in strs) for (i=1; i<=NF; i++) if ($i==str) print str, FNR, $0}' file2 file1 | sort -k1,1 -k2,2n
beginpoint 2 it is the beginpoint of my career.
beginpoint 4 We can start our beginpoint anytime we want.
beginpoint 8 beginpoint
end 3 The end is always far.
end 10 end
time 5 The time we utilise to make our life good should be more.
time 9 time
$ awk -F'[^[:alnum:]_]+' 'NR==FNR{strs[$0];next} {for (str in strs) for (i=1; i<=NF; i++) if ($i==str) print str, FNR, $0}' file2 file1 |
sort -k1,1 -k2,2n | cut -d' ' -f3-
it is the beginpoint of my career.
We can start our beginpoint anytime we want.
beginpoint
The end is always far.
end
The time we utilise to make our life good should be more.
time
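One caveat: sort -k1,1 orders the groups alphabetically by string, which happens to match the order of strings.txt in this example. If the groups must follow the order of the strings file itself, one option is to decorate each line with the string's line number in that file instead of the string. This is a minimal sketch under that assumption; the reordered strings file (time, end, beginpoint), the shortened source file, the scratch directory, and the names `order` and `grouped` are all made up for illustration.

```shell
# Illustrative input: strings deliberately NOT in alphabetical order.
tmpdir=$(mktemp -d)
printf '%s\n' time end beginpoint > "$tmpdir/file2"
printf '%s\n' \
  'it is the beginpoint of my career.' \
  'The end is always far.' \
  'The time we utilise to make our life good should be more.' \
  'end' > "$tmpdir/file1"

grouped=$(
  awk -F'[^[:alnum:]_]+' '
    NR==FNR { order[$0] = FNR; next }   # position of each string in file2
    # decorate: <position of string> <line number in file1> <line>
    { for (i=1; i<=NF; i++) if ($i in order) print order[$i], FNR, $0 }
  ' "$tmpdir/file2" "$tmpdir/file1" |
  sort -k1,1n -k2,2n |                  # group by string position, then by line
  cut -d' ' -f3-                        # strip the two decoration columns
)
printf '%s\n' "$grouped"

rm -rf "$tmpdir"
```

With this input the "time" line comes first, then the two "end" lines, then the "beginpoint" line, following the order of the strings file rather than the alphabet.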