My input looks like this:
fruit apple word
fruit lemon
fruit orange other word
meat ham word
vegetable salad other
vegetable lettuce more
How can I separate the groups of lines that share the same first word with a blank line? Like this:
fruit apple word
fruit lemon other word
fruit orange word

meat ham word

vegetable salad other
vegetable lettuce more
Edit: I forgot to mention that there may be spaces after the first word.
Answer 1
Here is a basic command that you can adapt to your own needs. It writes each line to a file named after its first word.
awk '{print $0 > $1}' inputfile
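Edit: Sorry, I just realized I misread your question, so this is not the right answer, although you can easily "rejoin" the files with blank lines between them.
Here is a possible solution: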
# Split data.txt into one file per first word, then rejoin the files with blank lines
for file in $(awk '{print $1; print $0 > $1}' data.txt | sort -u)
do
    cat "$file"
    echo
    rm "$file"
done > output.txt
If the file is already sorted, an awk-only solution:
awk '{a=$1; if (b != "" && a != b) {printf "\n";}; print $0; b = a}' inputfile
Revised following don_crissti's comment (thanks!):
awk '{if (a != "" && a != $1) {printf "\n";}; print $0; a = $1}' inputfile
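Regarding the edit in the question about stray spaces: with awk's default field separator, leading blanks are skipped and runs of blanks count as one separator, so $1 is still the first word. A minimal check, assuming a made-up sample (the temporary file and its contents are hypothetical, not from the question):

```shell
# Hypothetical sample: the second line has leading and repeated blanks.
tmp=$(mktemp)
printf '%s\n' 'fruit apple word' '  fruit   lemon' 'meat ham word' > "$tmp"

# Same awk program as above; $1 is "fruit" on both of the first two lines.
result=$(awk '{if (a != "" && a != $1) {printf "\n";}; print $0; a = $1}' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

The lines are printed unchanged (print $0), so the original spacing is preserved; only the comparison uses the trimmed first field.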
Answer 2
A sed solution (GNU sed syntax) could be:
sed '
/^\n/!{ # if the pattern space does not start with a newline
N # append the next input line
/^\(\w\+\b\).*\n\1/! s/\n/\n\n/ # if the first words differ, insert an extra newline
}
P # print up to the first newline
D # delete up to the first newline and restart the cycle
'
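The script above reads standard input, so it can be fed from a file or a pipe. A usage sketch, assuming GNU sed (for \w, \+, \b and \n in the replacement) and a made-up sample written to a hypothetical temporary file:

```shell
# Hypothetical sorted sample input.
tmp=$(mktemp)
printf '%s\n' 'fruit apple word' 'fruit lemon' 'meat ham word' \
    'vegetable salad other' 'vegetable lettuce more' > "$tmp"

# Same commands as above, without the inline comments.
sed_out=$(sed '
/^\n/!{
N
/^\(\w\+\b\).*\n\1/! s/\n/\n\n/
}
P
D
' "$tmp")
printf '%s\n' "$sed_out"

rm -f "$tmp"
```

Note that this pattern anchors the first word at the start of the line, so lines with leading blanks would need the regex adjusted.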
Answer 3
Another awk solution, assuming the input is sorted as in the sample input:
$ cat ip.txt
fruit apple word
fruit lemon
fruit orange other word
meat ham word
vegetable salad other
vegetable lettuce more
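Note: the order of the condition checks matters. !seen[$1]++ must come first so that the first word of every line, including line 1, is recorded. With NR>1 first, short-circuit evaluation would skip the increment on line 1 and a blank line would be printed before line 2.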
$ awk '!seen[$1]++ && NR>1{printf "\n"} 1' ip.txt
fruit apple word
fruit lemon
fruit orange other word

meat ham word

vegetable salad other
vegetable lettuce more
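To see why the order of the two conditions matters, the following sketch runs both orderings on a small made-up sample (the temporary file and the variable names good and bad are only for illustration):

```shell
# Hypothetical three-line sample.
tmp=$(mktemp)
printf '%s\n' 'fruit apple word' 'fruit lemon' 'meat ham word' > "$tmp"

# Order used in this answer: the first word of line 1 is recorded in seen[].
good=$(awk '!seen[$1]++ && NR>1{printf "\n"} 1' "$tmp")

# Swapped order: on line 1, NR>1 is false, so seen[$1]++ is never evaluated
# and line 2 is wrongly treated as the start of a new group.
bad=$(awk 'NR>1 && !seen[$1]++{printf "\n"} 1' "$tmp")

printf '%s\n---\n%s\n' "$good" "$bad"
rm -f "$tmp"
```

With the swapped order, a blank line appears between "fruit apple word" and "fruit lemon".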
A similar solution in perl:
$ perl -ane 'print "\n" if !$seen{$F[0]}++ && $. > 1; print' ip.txt
fruit apple word
fruit lemon
fruit orange other word

meat ham word

vegetable salad other
vegetable lettuce more
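Both one-liners only group correctly when lines with the same first word are adjacent. If the input is not sorted, one option is to sort on the first field before piping into awk. A sketch, assuming a sort that supports -s (stable sort, available in GNU and BSD sort) and a made-up unsorted sample in a hypothetical temporary file:

```shell
# Hypothetical unsorted sample.
tmp=$(mktemp)
printf '%s\n' 'meat ham word' 'fruit apple word' \
    'vegetable salad other' 'fruit lemon' > "$tmp"

# -k1,1 sorts on the first field only; -s keeps the original order within a group.
grouped=$(sort -s -k1,1 "$tmp" | awk '!seen[$1]++ && NR>1{printf "\n"} 1')
printf '%s\n' "$grouped"

rm -f "$tmp"
```

This changes the order of the groups, so it only fits when the original group order does not need to be preserved.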