I have a file like this:
abc:: vvnm\/asj\/pqr
sadnck
acdsd
abc:: kfjwej\/asj\/pqr
frtrt
ewrfe
adsf
abc:: flkm\/csj\/lqr
abc:: kmflkm\/asj\/pqr
sdvd
dfff
I want output like this [for each abc entry, the number of lines that follow it]:
3 kfjwej/asj/pqr
2 vvnm/asj/pqr
2 kmflkm/asj/pqr
0 flkm/csj/lqr
Answer 1
An awk solution (its pattern expects header lines that start with a number followed by :abc):
awk '/^[0-9]+:abc /{
if (abc) print count abc;
sub(/^[0-9]+/, "");
abc = $0; count = 0; next
}
abc{ count++ }
END{ print count abc }' file
Output:
2:abc vvvvv
3:abc kfjwej
2:abc kmflkm
An additional (and final) awk approach, for the new file format.
count_abc.awk script:
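#!/bin/awk -f
# Header line: print the finished group, then start a new one.
/^abc::/{
    if (abc) print count, abc;    # nothing to print before the first header
    gsub(/\\/, "", $2);           # strip the backslashes from the path
    abc = $2; count = 0; next
}
abc { count++ }                   # count the lines that follow a header
END { print count, abc }          # print the last group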
Usage:
awk -f count_abc.awk newfile
Output:
2 vvnm/asj/pqr
3 kfjwej/asj/pqr
0 flkm/csj/lqr
2 kmflkm/asj/pqr
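The question lists the desired result with the largest count first, while the script prints groups in file order. The following is a minimal sketch, not part of the original answer, that inlines the same counting logic and pipes it through sort. The temporary file, the variable names (sample, result, name, seen) and the choice of sort key are assumptions made for illustration; groups with equal counts may come out in a different relative order than in the question.

```shell
#!/bin/sh
# Recreate the sample input from the question in a temporary file (hypothetical path).
sample=$(mktemp)
cat > "$sample" <<'EOF'
abc:: vvnm\/asj\/pqr
sadnck
acdsd
abc:: kfjwej\/asj\/pqr
frtrt
ewrfe
adsf
abc:: flkm\/csj\/lqr
abc:: kmflkm\/asj\/pqr
sdvd
dfff
EOF

# Same counting logic as count_abc.awk, inlined so the sketch is self-contained.
# The "seen" flag is an assumption: it replaces the truthiness test on the header text.
result=$(awk '
/^abc::/ {
    if (seen) print count, name      # flush the previous group
    gsub(/\\/, "", $2)               # strip the backslashes from the path
    name = $2; count = 0; seen = 1
    next
}
seen { count++ }                     # every other line belongs to the current group
END  { if (seen) print count, name } # flush the last group
' "$sample" | sort -k1,1nr)          # numeric sort on the count, largest first

rm -f "$sample"
printf '%s\n' "$result"
```

Each header produces one output line, including a header that has no lines after it, so the sample yields four lines with the count of 3 at the top and the count of 0 at the bottom.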
Answer 2
Using a combination of sed, uniq and awk:
$ sed '/^[^[:space:]]/{s/^[^[:space:]]* //g;s#\\##g;h;}; g' file | uniq -c | awk '{ $1 -= 1; print }'
2 vvnm/asj/pqr
3 kfjwej/asj/pqr
0 flkm/csj/lqr
2 kmflkm/asj/pqr
The sed script, annotated:
/^[^[:space:]]/{ # this line starts with a non-space
s/^[^[:space:]]* //; # remove the thing that is not a space, up to the space
s#\\##g; # remove backslashes
h; # store in hold space
};
g; # get hold space
# (implicit print)
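The effect is to replace every "sub-line" with the "header line" it belongs to. This relies on header lines being the only lines that start with a non-whitespace character, that is, on the sub-lines being indented. It produces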
vvnm/asj/pqr
vvnm/asj/pqr
vvnm/asj/pqr
kfjwej/asj/pqr
kfjwej/asj/pqr
kfjwej/asj/pqr
kfjwej/asj/pqr
flkm/csj/lqr
kmflkm/asj/pqr
kmflkm/asj/pqr
kmflkm/asj/pqr
Then uniq -c counts the number of consecutive identical lines, producing
3 vvnm/asj/pqr
4 kfjwej/asj/pqr
1 flkm/csj/lqr
3 kmflkm/asj/pqr
With awk, we then simply subtract one from the number in the first field, so that the header line itself is not counted.
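The three stages can be run one at a time to inspect each intermediate result. The sketch below is only an illustration: the two-space indentation of the sub-lines is an assumption (the sed address needs sub-lines that begin with whitespace), and the temporary file and the variable names expanded, counted and result are hypothetical.

```shell
#!/bin/sh
# Sample input; the indentation of the sub-lines is an assumption.
input=$(mktemp)
cat > "$input" <<'EOF'
abc:: vvnm\/asj\/pqr
  sadnck
  acdsd
abc:: kfjwej\/asj\/pqr
  frtrt
  ewrfe
  adsf
abc:: flkm\/csj\/lqr
abc:: kmflkm\/asj\/pqr
  sdvd
  dfff
EOF

# Stage 1: replace every line with the cleaned-up header it belongs to.
expanded=$(sed '/^[^[:space:]]/{s/^[^[:space:]]* //;s#\\##g;h;}; g' "$input")

# Stage 2: count runs of identical consecutive lines.
counted=$(printf '%s\n' "$expanded" | uniq -c)

# Stage 3: subtract the header line itself from each count.
result=$(printf '%s\n' "$counted" | awk '{ $1 -= 1; print }')

rm -f "$input"
printf '%s\n' "$result"
```

Because uniq only merges adjacent duplicates, two neighbouring headers with identical text would be merged into a single run, so this approach assumes that adjacent headers differ.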