sed paragraph tags

Answer 1

Since you originally asked for a sed solution, here is one:

sed '/./{H;1h;$! d}
g;/{p}$/d
s#^{p}.*#&\n{/p}#;p
s/.*/{p}/;h;d' somefile.txt

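Explanation

Line 1: append non-empty lines to the hold buffer (the first line is copied rather than appended, to avoid starting the buffer with a newline). Processing continues past this line only on an empty line or at the end of the file.
Line 2: discard buffers that contain no text, to handle multiple consecutive empty lines or trailing empty lines.
Line 3: if the buffer begins with an opening tag, add the closing tag. Then print.
Line 4: refill the hold buffer with a new opening tag.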
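
To see the four lines working together, here is a minimal sketch that feeds the same script a small made-up input on stdin. The sample text is an assumption for illustration, and the script is assumed to run under GNU sed (it relies on \n producing a newline in the replacement).

```shell
# Hypothetical input: a header line, a two-line paragraph, two blank
# lines in a row, then a one-line paragraph at the end of the input.
result=$(printf '%s\n' \
    'Title' \
    '' \
    'First paragraph, line one' \
    'line two' \
    '' \
    '' \
    'Second paragraph' |
sed '/./{H;1h;$! d}
g;/{p}$/d
s#^{p}.*#&\n{/p}#;p
s/.*/{p}/;h;d')

printf '%s\n' "$result"
# Prints:
# Title
# {p}
# First paragraph, line one
# line two
# {/p}
# {p}
# Second paragraph
# {/p}
```

The header stays untagged because the hold buffer does not start with {p} until line 4 of the script has run once. Blank lines are consumed as paragraph separators and do not appear in the output.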

Answer 2

I would suggest an awk approach:

awk 'NR>1 && NF{$0="{p}" RS $0 RS "{/p}"}1' file

Output:

Section 5. General Information About Project Gutenberg-tm electronic works.

{p}
Description
{/p}

{p}
Professor Michael S. Hart is the originator of the Project Gutenberg-tm concept of a library of electronic works that could be freely shared with anyone. For thirty years, he produced and distributed Project Gutenberg-tm eBooks with only a loose network of volunteer support.
{/p}

{p}
Project Gutenberg-tm eBooks are often created from several printed editions, all of which are confirmed as Public Domain in the U.S. unless a copyright notice is included. Thus, we do not necessarily keep eBooks in compliance with any particular paper edition.
{/p}

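RS - awk's record separator, which defaults to a newline (\n)

NR>1 - skips the first line (the header)

NF - the number of fields in the current line; it is non-zero only for non-empty lines, so blank lines are left alone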
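
As a quick check of how NR>1 and NF interact, the sketch below runs the one-liner on a small made-up input. Note that it wraps every non-empty line separately. The second command is not part of the answer above: it is a hedged variant, assuming paragraphs may span several lines, that uses awk's paragraph mode (RS set to the empty string, so blank lines separate records).

```shell
# Hypothetical input with single-line paragraphs, read from stdin.
per_line=$(printf '%s\n' 'Header' '' 'First paragraph' '' 'Second paragraph' |
awk 'NR>1 && NF{$0="{p}" RS $0 RS "{/p}"}1')

printf '%s\n' "$per_line"
# Prints:
# Header
#
# {p}
# First paragraph
# {/p}
#
# {p}
# Second paragraph
# {/p}

# Variant (an assumption, not the original answer): paragraph mode.
# RS="" makes each blank-line-separated block one record, and
# ORS="\n\n" puts a blank line back between the printed records.
per_block=$(printf '%s\n' 'Header' '' 'First paragraph, line one' 'line two' '' 'Second paragraph' |
awk 'BEGIN { RS = ""; ORS = "\n\n" }
     NR > 1 { $0 = "{p}\n" $0 "\n{/p}" }
     1')

printf '%s\n' "$per_block"
```

With the variant, the two-line paragraph ends up inside a single {p} ... {/p} pair, whereas the original one-liner would tag each of its lines on its own.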
