Using awk to print a dividing line after rows that do not share a unique value

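I have a CSV file produced by a deduplication tool (rmlint). I would like to use awk to add a dividing line after each group of rows that share the same value in a given field, such as the checksum.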

The file:

type,path,size,checksum
emptydir,"/home/user/tree2/b",0,00000000000000000000000000000000
duplicate_dir,"/home/user/test/b",4,f8772f6fda08bbc826543334663d6f13
duplicate_dir,"/home/user/test/a",4,f8772f6fda08bbc826543334663d6f13
duplicate_dir,"/home/user/tree/b",8,62202a79add28a72209b41b6c8f43400
duplicate_dir,"/home/user/tree/a",8,62202a79add28a72209b41b6c8f43400
duplicate_dir,"/home/user/tree2/a",4,311095bc5669453990cd205b647a1a00

Desired output:

type,path,size,checksum
------------------------
emptydir,"/home/user/tree2/b",0,00000000000000000000000000000000
------------------------
duplicate_dir,"/home/user/test/b",4,f8772f6fda08bbc826543334663d6f13
duplicate_dir,"/home/user/test/a",4,f8772f6fda08bbc826543334663d6f13
------------------------
duplicate_dir,"/home/user/tree/b",8,62202a79add28a72209b41b6c8f43400
duplicate_dir,"/home/user/tree/a",8,62202a79add28a72209b41b6c8f43400
------------------------
duplicate_dir,"/home/user/tree2/a",4,311095bc5669453990cd205b647a1a00

How can I do this with awk?

Answer 1

Store the previous value, and print a divider whenever the current value does not match it:

awk -F, 'NR > 1 && $NF != prev { print "------------------------" } { prev = $NF } 1'

($NF is the last field; adjust as needed.)
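To group on a field other than the last one, the field number can be passed in as a variable. This is a minimal sketch rather than part of the original answer: the function name `separate_by_column`, the variable `col`, and the sample file `dupes.csv` are made up for illustration. It also assumes that no field before the chosen column contains a comma, since a quoted path with an embedded comma would shift the field numbers.

```shell
# Sample data in the same layout as the question (hypothetical file name).
cat > dupes.csv <<'EOF'
type,path,size,checksum
emptydir,"/home/user/tree2/b",0,00000000000000000000000000000000
duplicate_dir,"/home/user/test/b",4,f8772f6fda08bbc826543334663d6f13
duplicate_dir,"/home/user/test/a",4,f8772f6fda08bbc826543334663d6f13
EOF

# Print a divider whenever the value in column $1 differs from the previous line's.
# Usage: separate_by_column COLUMN FILE
separate_by_column() {
    awk -F, -v col="$1" '
        NR > 1 && $col != prev { print "------------------------" }
        { prev = $col }
        1
    ' "$2"
}

separate_by_column 4 dupes.csv
```

Calling it with 3 instead of 4 would group on the size column in this sample.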

The number of dashes can be made easier to customize, as follows:

awk -F, 'BEGIN { line = sprintf("%20s", ""); gsub(/ /, "-", line) }
         NR > 1 && $NF != prev { print line } { prev = $NF } 1'

Or, with GNU AWK:

awk -F, 'BEGIN { line = gensub(/ /, "-", "g", sprintf("%20s", "")) }
         NR > 1 && $NF != prev { print line } { prev = $NF } 1'

Change "20" to any suitable value.
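The divider can also be built with a plain loop, which needs neither gsub nor the GNU-only gensub. A minimal sketch, assuming the width is passed in through an awk variable; the function name `with_divider`, the variable `n`, and the tiny sample input are chosen here for illustration only:

```shell
# Build a divider of n dashes in BEGIN, then print it whenever the last field changes.
# Usage: with_divider WIDTH < input
with_divider() {
    awk -F, -v n="$1" '
        BEGIN { for (i = 0; i < n; i++) line = line "-" }
        NR > 1 && $NF != prev { print line }
        { prev = $NF }
        1
    '
}

# Hypothetical sample input: a header line followed by three data rows.
printf '%s\n' 'type,checksum' 'a,1' 'b,1' 'c,2' | with_divider 20
```

Keeping the width in a variable means the awk program itself does not have to be edited to change the length of the line.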

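Answer 2

Here is another version in which you can choose the length of the dividing line (20 in this example). The {1..20} brace expansion requires a shell that supports it, such as bash, zsh, or ksh93: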

$ awk -F, -v i="$(printf -- '-%.0s' {1..20})" '{ if (NR > 1 && a != $NF) print i; print; a = $NF }' inputfile

type,path,size,checksum
--------------------
emptydir,"/home/user/tree2/b",0,00000000000000000000000000000000
--------------------
duplicate_dir,"/home/user/test/b",4,f8772f6fda08bbc826543334663d6f13
duplicate_dir,"/home/user/test/a",4,f8772f6fda08bbc826543334663d6f13
--------------------
duplicate_dir,"/home/user/tree/b",8,62202a79add28a72209b41b6c8f43400
duplicate_dir,"/home/user/tree/a",8,62202a79add28a72209b41b6c8f43400
--------------------
duplicate_dir,"/home/user/tree2/a",4,311095bc5669453990cd205b647a1a00
