How to unwrap paragraphs using sed

2024-5-15

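I have a text file containing paragraphs that are hard-wrapped at 76 characters, with one blank line between paragraphs. I want to use sed to convert it into a file with one line per paragraph and no blank lines.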

Sample input:

Start of p1
continued p1
end of p1.

Start of p2
end of p2.

Start of p3
continued p3
even more p3
end of p3.

Sample output:

Start of p1 continued p1 end of p1.
Start of p2 end of p2.
Start of p3 continued p3 even more p3 end of p3.

Answer 1

Using GNU sed

$ sed ':a;N;$!{/\n$/!ba}; s/[[:blank:]]*\n[[:blank:]]*/ /g' textfile
Start of p1 continued p1 end of p1. 
Start of p2 end of p2. 
Start of p3 continued p3 even more p3 end of p3.
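
In the output above, every line except the last ends in a space, because the newline in front of the separating blank line is turned into a space as well. If that matters, one more substitution can strip it. The following is a minimal sketch that assumes GNU sed; the function name `unwrap_trim` is made up for illustration and is not part of the original answer.

```shell
# Hypothetical wrapper (the name unwrap_trim is an assumption).
# Same GNU sed command as above, plus a final substitution that
# deletes any blanks left at the end of the joined line.
unwrap_trim() {
  sed ':a;N;$!{/\n$/!ba}; s/[[:blank:]]*\n[[:blank:]]*/ /g; s/[[:blank:]]*$//' "$@"
}

# Usage: unwrap_trim textfile
```

The extra `s/[[:blank:]]*$//` runs once per collected paragraph, after the newlines have been joined, so it only touches the end of each output line.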

How it works

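:a

This defines the label a.

N

This reads the next line and appends it, preceded by a newline, to the pattern space.

$!{/\n$/!ba}

If (a) we are not on the last line of the file and (b) the pattern space does not end with a newline, meaning the line just appended was not empty, jump (branch) back to label a.

s/[[:blank:]]*\n[[:blank:]]*/ /g

If we get here, the pattern space holds a complete paragraph. Find every newline, optionally preceded or followed by blanks (spaces or tabs), and replace it with a single space.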
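
The steps above can also be written as a multi-line sed script with one comment per command, which may be easier to read than the one-liner. This is only a sketch: the wrapper name `unwrap_paragraphs` is hypothetical, and the script was reasoned through for GNU sed. Because it separates commands with newlines rather than semicolons, it avoids the label and closing-brace syntax that the one-liner depends on.

```shell
# Hypothetical wrapper; reads the files given as arguments, or stdin.
unwrap_paragraphs() {
  sed '
    # Label marking the top of the loop.
    :a
    # Append the next input line to the pattern space, after a newline.
    N
    # Not on the last line and the appended line was not empty: loop again.
    $!{
      /\n$/!ba
    }
    # A whole paragraph is collected: turn each newline, together with
    # any blanks around it, into one space.
    s/[[:blank:]]*\n[[:blank:]]*/ /g
  ' "$@"
}

# Usage: unwrap_paragraphs textfile
```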

Using BSD/OSX sed

Try this (untested):

sed -e :a -e 'N;$!{/\n$/!ba' -e '}' -e 's/[[:blank:]]*\n[[:blank:]]*/ /g' textfile
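
Since the command above is marked as untested, one way to check it on a given system is to feed it a short sample and compare the result with the expected text. The sketch below does that; the function name `unwrap_bsd` and the variable names are assumptions made for illustration, and the sample is a shortened version of the input from the question.

```shell
# Hypothetical helper: runs the multi -e form of the command on stdin.
unwrap_bsd() {
  sed -e :a -e 'N;$!{/\n$/!ba' -e '}' -e 's/[[:blank:]]*\n[[:blank:]]*/ /g'
}

# Two short paragraphs separated by one blank line.
input=$(printf 'Start of p1\ncontinued p1\nend of p1.\n\nStart of p2\nend of p2.\n')

# The first line keeps a trailing space (the newline before the blank line).
expected=$(printf 'Start of p1 continued p1 end of p1. \nStart of p2 end of p2.')

actual=$(printf '%s\n' "$input" | unwrap_bsd)

if [ "$actual" = "$expected" ]; then
  echo "OK: output matches"
else
  echo "MISMATCH, got:"
  printf '%s\n' "$actual"
fi
```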

Using awk

$ awk '{printf "%s ",$0} /^$/{print ""} END{print ""}' textfile
Start of p1 continued p1 end of p1.  
Start of p2 end of p2.  
Start of p3 continued p3 even more p3 end of p3.
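
An alternative worth considering is awk's paragraph mode, which is not what the command above uses. Setting `RS` to the empty string makes awk treat blank-line-separated blocks as records, with newlines acting as field separators, and assigning `$1 = $1` rebuilds each record with single spaces between fields. The sketch below assumes a POSIX awk; the function name `unwrap_awk` is hypothetical.

```shell
# Hypothetical wrapper; reads the files given as arguments, or stdin.
unwrap_awk() {
  # RS = ""  : each blank-line-separated paragraph is one record
  # $1 = $1  : rebuild the record, joining its fields with one space
  awk 'BEGIN { RS = "" } { $1 = $1; print }' "$@"
}

# Usage: unwrap_awk textfile
```

Unlike the commands above, this leaves no trailing spaces, but it also collapses any run of spaces or tabs inside a line into a single space, which may or may not be wanted.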
