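The lines from one occurrence of Cat up to the next occurrence of Cat should be joined into a single line, with "," as the delimiter.
The input file is as follows: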
Cat
AA
BB
CC
Cat
AA-1
BB-1
CC-1
Expected output:
Cat,AA,BB,CC
Cat,AA-1,BB-1,CC-1
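Answer 1
With GNU sed:
sed ':a;N;s/\n/,/;ta' file | sed 's/,Cat/\nCat/g'
Or, using tr to join the lines (the final s/,$// drops the trailing comma that tr leaves behind):
tr '\n' ',' < file | sed 's/,Cat/\nCat/g;s/,$//'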
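A possible variant of the same idea, assuming `paste` is available: `paste -s -d,` joins all lines with commas and leaves no trailing comma, so only the split before each `Cat` remains to be done. This is a sketch rather than part of the original answer. The `\n` in the replacement still assumes GNU sed, and the temporary file stands in for `file`.

```shell
# Build the sample input from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' Cat AA BB CC Cat AA-1 BB-1 CC-1 > "$sample"

# Join every line with commas, then start a new line before each ",Cat".
result=$(paste -s -d, "$sample" | sed 's/,Cat/\nCat/g')
printf '%s\n' "$result"

rm -f "$sample"
```

Like the commands above, this reads the whole input as one long line, so it suits small to medium files.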
Answer 2
You could do this with sed:
sed '1{h;d;};/^Cat$/!{H;$!d;};x;s/\n/,/g;${x;/^Cat$/H;x;}' infile
Explanation:
sed '1{              # if this is the 1st line
    h                # copy over the hold space
    d                # and delete it
}
/^Cat$/!{            # if the line doesn't match Cat
    H                # append to hold space and
    $!d              # delete it if it's not the last line
}
x                    # exchange pattern space w. hold buffer
s/\n/,/g             # replace all newline chars with commas
${                   # check if the last line of input matches Cat:
    x                # exchange pattern space w. hold buffer
    /^Cat$/H         # if the line matches Cat append it to hold buffer
    x                # exchange back
}' infile
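To see the script in action, here is a small run on the sample input and on the edge case that the `${...}` block handles, where the input ends with a bare `Cat` line. GNU sed is assumed, and `join_groups` is a hypothetical wrapper name used only for this sketch.

```shell
# Wrap the one-liner from this answer so it can be fed from a pipe.
join_groups() {
    sed '1{h;d;};/^Cat$/!{H;$!d;};x;s/\n/,/g;${x;/^Cat$/H;x;}'
}

# Sample input from the question.
out1=$(printf '%s\n' Cat AA BB CC Cat AA-1 BB-1 CC-1 | join_groups)
printf '%s\n' "$out1"
# Cat,AA,BB,CC
# Cat,AA-1,BB-1,CC-1

# Edge case: the last line is Cat, so it must come out as its own group.
out2=$(printf '%s\n' Cat AA Cat | join_groups)
printf '%s\n' "$out2"
# Cat,AA
# Cat
```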
Answer 3
With awk:
awk '
    /Cat/ {
        if (NR>1) print ""
        printf "%s", $0
        next
    }
    {printf ",%s", $0}
    END {print ""}
' file
Another version that relies heavily on awk variables (added before I read your comment that "Cat" needs to be a case-insensitive regular expression):
awk 'BEGIN {RS="Cat"; FS="\n"; OFS=","} NR>1 {$1=RS; NF--; print}' file
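For the case-insensitive requirement mentioned in that comment, one possible sketch is to compare the lowercased line against "cat" instead of matching a fixed string. This assumes the header must be the whole line, and that any lines before the first header can be dropped. `join_ci` and `started` are hypothetical names introduced for the example.

```shell
# Treat a line as a group header when it equals "cat" in any letter case.
join_ci() {
    awk '
        tolower($0) == "cat" {
            if (started) print ""     # finish the previous group
            printf "%s", $0           # start a new output line with the header
            started = 1
            next
        }
        started { printf ",%s", $0 }  # append members to the current group
        END { if (started) print "" } # terminate the last group
    '
}

out=$(printf '%s\n' Cat AA BB cAT AA-1 | join_ci)
printf '%s\n' "$out"
# Cat,AA,BB
# cAT,AA-1
```

The header is printed as it appears in the input, so the original letter case is preserved.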
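Answer 4
This solution does not need to read the whole file into memory. In other words: it can process a 1 TB file on a machine with 1 GB of memory, as long as each joined line is smaller than 1 GB.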
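perl -ne 'BEGIN { $sep = shift; }
          # On a separator line, print the group collected so far and start a new one.
          if(/^$sep$/o) { @p and print join(",", @p)."\n"; @p = (); }
          chomp; push @p, $_;
          # @p already starts with the separator line, so print it as is.
          END { @p and print join(",", @p)."\n"; }' Cat /tmp/cat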