sed

I have a huge CVS log file, already stripped of useless information, with contents like this:

Working file: unmodifiedfile1.c
================
Working file: modifiedfile1.h
----------------------------------
revision 1.3
Fixed some bug
================
Working file: unmodifiedfile2.h
================
Working file: modifiedfile2.h
----------------------------------
revision 1.1
Added some feature
================
Working file: unmodifiedfile3.h

I want to remove the lines that relate to unmodified files, leaving:

Working file: modifiedfile1.h
----------------------------------
revision 1.3
Fixed some bug
================
Working file: modifiedfile2.h
----------------------------------
revision 1.1
Added some feature
================

The pattern to match is:

Working file: FILENAME
================

This is what I have managed so far:

sed '/Working file:/ N ; s/\n/PLACEHOLDER/' changelog.txt |
grep -v 'PLACEHOLDER===' |
sed 's/PLACEHOLDER/\n/'

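But I am sure there is a cleaner solution that my ignorance of sed keeps me from finding. (As a bonus, it would be nice to be able to delete the last line as well, if necessary.)

PS

Output ending with: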

================
Working file: unmodifiedfile3.h

is also acceptable.

Answer 1

sed

This should be close to what you are after:

<cvslog sed -n '/Working file/ { N; /\n=\+$/b; :a; N; /\n=\+$/!ba; p; }'

Output:

Working file: modifiedfile1.h
----------------------------------
revision 1.3
Fixed some bug
================
Working file: modifiedfile2.h
----------------------------------
revision 1.1
Added some feature
================

Explanation

Here is the same sed script with comments:

/Working file/ {
  N                 # append next line to pattern space
  /\n=\+$/b         # is it a file separator -> next file
  :a
  N                 # append next line to pattern space
  /\n=\+$/!ba       # isn't it a file separator -> read next line
  p                 # otherwise print accumulated text
}
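The loop above can be tried on the sample from the question. The following is a minimal sketch, not the answer's exact command: it assumes the log is written to a temporary file (the `log` and `kept` variable names are made up for illustration), it replaces the GNU-specific `=\+` with the equivalent `==*`, and it puts one command per line so that the label and the branches should also parse in a non-GNU sed.

```shell
# Sample log from the question, written to a temporary file (hypothetical path).
log=$(mktemp)
printf '%s\n' \
  'Working file: unmodifiedfile1.c' \
  '================' \
  'Working file: modifiedfile1.h' \
  '----------------------------------' \
  'revision 1.3' \
  'Fixed some bug' \
  '================' \
  'Working file: unmodifiedfile2.h' \
  '================' \
  'Working file: modifiedfile2.h' \
  '----------------------------------' \
  'revision 1.1' \
  'Added some feature' \
  '================' \
  'Working file: unmodifiedfile3.h' > "$log"

# Same logic as the commented script: skip a header that is directly
# followed by a separator, otherwise accumulate lines up to the next
# separator and print the whole block.
kept=$(sed -n '
/Working file/ {
  N
  /\n==*$/b
  :a
  N
  /\n==*$/!ba
  p
}' "$log")

printf '%s\n' "$kept"
rm -f "$log"
```

Because of `-n`, the trailing `Working file: unmodifiedfile3.h` line is never printed: the `N` that follows it finds no more input and sed exits without output.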

awk

If you tell awk to use the file-separator line as the record separator (RS), defining a sensible selection criterion becomes fairly simple:

<cvslog awk 'NF>2' RS='\n=+\n' FS='\n' ORS='\n\n'

Output:

Working file: modifiedfile1.h
----------------------------------
revision 1.3
Fixed some bug

Working file: modifiedfile2.h
----------------------------------
revision 1.1
Added some feature
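A regular expression in RS is an extension found in gawk and mawk; POSIX awk only honours the first character of RS. As a hedged alternative, here is a line-oriented sketch that buffers each block and prints it only when it holds more than the file name line. Unlike the command above, it keeps the separator lines. The `log` and `blocks` variable names and the temporary file are assumptions for the demonstration.

```shell
# Sample log from the question, written to a temporary file (hypothetical path).
log=$(mktemp)
printf '%s\n' \
  'Working file: unmodifiedfile1.c' \
  '================' \
  'Working file: modifiedfile1.h' \
  '----------------------------------' \
  'revision 1.3' \
  'Fixed some bug' \
  '================' \
  'Working file: unmodifiedfile2.h' \
  '================' \
  'Working file: modifiedfile2.h' \
  '----------------------------------' \
  'revision 1.1' \
  'Added some feature' \
  '================' \
  'Working file: unmodifiedfile3.h' > "$log"

blocks=$(awk '
  /^=+$/ {                       # a separator line closes the current block
    if (n > 1)                   # more than just the "Working file:" line
      printf "%s%s\n", buf, $0   # print the block followed by its separator
    buf = ""; n = 0
    next
  }
  { buf = buf $0 "\n"; n++ }     # accumulate the lines of the current block
' "$log")

printf '%s\n' "$blocks"
rm -f "$log"
```

A block that is never closed by a separator, such as the final `Working file: unmodifiedfile3.h`, is silently dropped by this sketch.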

bash and coreutils

Just for fun:

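# split the log after each separator line into xx00, xx01, ...
# (-s suppresses the byte counts csplit would otherwise print)
csplit -s cvslog '/=\{16\}/1' '{*}'
# count the lines of each piece and drop the "total" line from wc
wc -l xx* |
head -n-1 |
while read -r n f; do
  # pieces longer than two lines belong to modified files
  if (( n > 2 )); then
    cat "$f"
  fi
done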

Output:

Working file: modifiedfile1.h
----------------------------------
revision 1.3
Fixed some bug
================
Working file: modifiedfile2.h
----------------------------------
revision 1.1
Added some feature
================

Answer 2

sed '/Working file:/ N ; s/\n/PLACEHOLDER/' changelog.txt |
grep -v 'PLACEHOLDER===' |
sed 's/PLACEHOLDER/\n/'

can indeed be shortened to:

$ sed '/Working file:/{N;/===/d}' changelog.txt 
Working file: modifiedfile1.h
----------------------------------
revision 1.3
Fixed some bug
================
Working file: modifiedfile2.h
----------------------------------
revision 1.1
Added some feature
================
Working file: unmodifiedfile3.h


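  • To delete every line containing Working file: whose following line contains ===, and also the last line if it contains Working file:, see the ip.txt example below.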

Thanks to @ilkkachu for the suggestion: if the pattern needs to match at the start of the line, use ^Working file:
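To illustrate why anchoring can matter, here is a small sketch with a made-up log in which a commit message happens to contain the text `Working file:`. The input, the temporary file, and the `loose` and `anchored` variable names are assumptions for the demonstration only.

```shell
# Hypothetical log: the commit message on line 4 mentions "Working file:".
log=$(mktemp)
printf '%s\n' \
  'Working file: a.h' \
  '----------------------------------' \
  'revision 1.2' \
  'Stop parsing Working file: lines' \
  '================' \
  'Working file: b.c' \
  '================' > "$log"

# Unanchored: the commit message also triggers N, so the message and the
# separator after it are deleted together with the b.c entry.
loose=$(sed '/Working file:/{N;/===/d}' "$log")

# Anchored: only lines that start with "Working file:" are treated as headers.
anchored=$(sed '/^Working file:/{N;/===/d}' "$log")

printf 'unanchored:\n%s\n\nanchored:\n%s\n' "$loose" "$anchored"
rm -f "$log"
```

With the anchored pattern the commit message and its separator survive, and only the unmodified `b.c` entry is removed.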

$ cat ip.txt 
Working file: 123
================
Working file: f1
----------------------------------
revision 1.3
Fixed some bug
================
Working file: abc
================
Working file: file
----------------------------------
revision 1.1
Added some feature
================
Working file: xyz

$ sed '/Working file:/{N;/===/d}' ip.txt | sed '${/Working file:/d}' 
Working file: f1
----------------------------------
revision 1.3
Fixed some bug
================
Working file: file
----------------------------------
revision 1.1
Added some feature
================
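The two sed processes can probably be folded into one by testing for the last line before calling `N`. This is a sketch under that assumption rather than part of the original answer; it recreates the ip.txt sample in a temporary file, and the `ip` and `result` variable names are made up for illustration.

```shell
# The ip.txt sample from above, written to a temporary file (hypothetical path).
ip=$(mktemp)
printf '%s\n' \
  'Working file: 123' \
  '================' \
  'Working file: f1' \
  '----------------------------------' \
  'revision 1.3' \
  'Fixed some bug' \
  '================' \
  'Working file: abc' \
  '================' \
  'Working file: file' \
  '----------------------------------' \
  'revision 1.1' \
  'Added some feature' \
  '================' \
  'Working file: xyz' > "$ip"

# $d  : a "Working file:" line that is the last line of input is deleted
# N   : otherwise append the next line to the pattern space
# /===/d : delete both lines when the appended line is a separator
result=$(sed '/Working file:/{
  $d
  N
  /===/d
}' "$ip")

printf '%s\n' "$result"
rm -f "$ip"
```

Checking `$` first also avoids running `N` on the last line, where GNU sed and POSIX sed differ in whether the pattern space is printed.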
