I have a huge cvs log file, already stripped of useless information, that looks like this:
Working file: unmodifiedfile1.c
================
Working file: modifiedfile1.h
----------------------------------
revision 1.3
Fixed some bug
================
Working file: unmodifiedfile2.h
================
Working file: modifiedfile2.h
----------------------------------
revision 1.1
Added some feature
================
Working file: unmodifiedfile3.h
I want to remove the lines that relate to unmodified files, leaving:
Working file: modifiedfile1.h
----------------------------------
revision 1.3
Fixed some bug
================
Working file: modifiedfile2.h
----------------------------------
revision 1.1
Added some feature
================
The pattern to match is:
Working file: FILENAME
================
This is what I have managed so far:
sed '/Working file:/ N ; s/\n/PLACEHOLDER/' changelog.txt |
grep -v 'PLACEHOLDER===' |
sed 's/PLACEHOLDER/\n/'
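But I am sure there is a cleaner solution that my ignorance of sed keeps me from finding... (Also, as a bonus, it would be nice to be able to delete the last line if necessary.)
PS
Output ending with: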
================
Working file: unmodifiedfile3.h
is also acceptable.
Answer 1
sed
This should be close to what you are after:
<cvslog sed -n '/Working file/ { N; /\n=\+$/b; :a; N; /\n=\+$/!ba; p; }'
Output:
Working file: modifiedfile1.h
----------------------------------
revision 1.3
Fixed some bug
================
Working file: modifiedfile2.h
----------------------------------
revision 1.1
Added some feature
================
Explanation
Here is the same sed script with comments:
/Working file/ {
# append the next line to the pattern space
N
# separator right after the header: unmodified file, skip to the next one
/\n=\+$/b
:a
# append the next line to the pattern space
N
# not a separator yet: keep reading
/\n=\+$/!ba
# separator reached: print the accumulated block
p
}
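The same accumulate-until-separator loop can be sketched with one -e fragment per command, which avoids relying on GNU sed's handling of semicolons after labels and on the \+ extension (==* is used instead). The function name filter_modified is hypothetical, and anchoring the header pattern with ^ is an assumption about the log format:

```shell
# Hypothetical wrapper; reads the log from stdin or from the files given as arguments.
filter_modified() {
  # -n: print only what the p command emits.
  # First N:  pull in the line after the header; a separator there means "unmodified", so skip.
  # Loop :a:  keep appending lines until the pattern space ends with a separator, then print.
  sed -n \
    -e '/^Working file:/{' \
    -e 'N' \
    -e '/\n==*$/b' \
    -e ':a' \
    -e 'N' \
    -e '/\n==*$/!ba' \
    -e 'p' \
    -e '}' "$@"
}
```

Usage would be filter_modified cvslog or filter_modified <cvslog. A final block that is not terminated by a separator line is dropped, because N finds no further input before p is reached.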
awk
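If you tell awk to use the file-separator line as the record separator (RS), defining a sensible selection criterion becomes fairly simple. Note that a regular-expression RS like this one needs GNU awk or mawk; POSIX leaves a multi-character RS unspecified: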
<cvslog awk 'NF>2' RS='\n=+\n' FS='\n' ORS='\n\n'
Output (ORS='\n\n' puts a blank line after each record):
Working file: modifiedfile1.h
----------------------------------
revision 1.3
Fixed some bug

Working file: modifiedfile2.h
----------------------------------
revision 1.1
Added some feature

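For an awk that does not support a regular-expression RS, a line-by-line variant is possible. This is only a sketch: filter_modified_awk is a hypothetical name, it assumes every header starts with "Working file:" and every separator line consists solely of = characters, and unlike the RS version it keeps the separator lines in the output:

```shell
# Hypothetical wrapper; reads the log from stdin or from the files given as arguments.
filter_modified_awk() {
  awk '
    # A header starts a new block; remember it and count it as line 1.
    /^Working file:/ { buf = $0; n = 1; next }
    # Inside a block: append the line and check for the closing separator.
    n > 0 {
      buf = buf "\n" $0
      n++
      if ($0 ~ /^=+$/) {
        # Header plus separator only (2 lines) means the file was not modified.
        if (n > 2) print buf
        n = 0
      }
    }
  ' "$@"
}
```

A trailing header with no separator after it is never printed, which matches the bonus request in the question.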
bash and coreutils
Just for fun:
csplit cvslog '/=\{16\}/1' '{*}'
wc -l xx* |
head -n-1 |
while read -r n f; do
  # chunks longer than header + separator belong to modified files
  if (( n > 2 )); then
    cat "$f"
  fi
done
Output:
Working file: modifiedfile1.h
----------------------------------
revision 1.3
Fixed some bug
================
Working file: modifiedfile2.h
----------------------------------
revision 1.1
Added some feature
================
Answer 2
sed '/Working file:/ N ; s/\n/PLACEHOLDER/' changelog.txt |
grep -v 'PLACEHOLDER===' |
sed 's/PLACEHOLDER/\n/'
can indeed be shortened to:
$ sed '/Working file:/{N;/===/d}' changelog.txt
Working file: modifiedfile1.h
----------------------------------
revision 1.3
Fixed some bug
================
Working file: modifiedfile2.h
----------------------------------
revision 1.1
Added some feature
================
Working file: unmodifiedfile3.h
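- This deletes every line containing Working file: together with the following line, if that following line contains ===
- To also delete the last line when it contains Working file:, pipe the result through a second sed, as in the ip.txt example that follows
Thanks to @ilkkachu for the suggestion: if the pattern needs to match at the start of a line, use ^Working file: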
$ cat ip.txt
Working file: 123
================
Working file: f1
----------------------------------
revision 1.3
Fixed some bug
================
Working file: abc
================
Working file: file
----------------------------------
revision 1.1
Added some feature
================
Working file: xyz
$ sed '/Working file:/{N;/===/d}' ip.txt | sed '${/Working file:/d}'
Working file: f1
----------------------------------
revision 1.3
Fixed some bug
================
Working file: file
----------------------------------
revision 1.1
Added some feature
================
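The two sed calls in that pipeline can probably be folded into a single invocation. The sketch below is an assumption rather than part of the answer above: the function name drop_unmodified is hypothetical, the header is anchored with ^, and the separator test is tightened to a line made up only of = characters:

```shell
# Hypothetical wrapper; reads the log from stdin or from the files given as arguments.
drop_unmodified() {
  # $d         : a header on the very last line has nothing after it, so delete it
  # N          : otherwise append the following line
  # /\n==*$/d  : header directly followed by a separator means unmodified, delete both
  sed '/^Working file:/{$d;N;/\n==*$/d;}' "$@"
}
```

Running drop_unmodified ip.txt should give the same result as the two-stage pipeline for input shaped like ip.txt.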