Joining multiple lines based on a regular expression

Answer 1

Using GNU awk for multi-char RS, RT, gensub(), and \s, and without reading the whole file into memory at once:

$ awk -v RS='\\s*</?blockquote>\\s*' '{ORS=gensub(/\s+/,"","g",RT)} 1' file
foo

bar<blockquote>That's one small step for man, one giant leap for mankind

A new line and another quote</blockquote>baz

Answer 2

Using a Perl one-liner:

Perl >= 5.36:

$ perl -gpe 's/(\w+)\n\n(<\/?blockquote\b[^\n]+)\s*\n/$1$2/g' file

Or, for Perl < 5.36:

$ perl -0777 -pe 's/(\w+)\n\n(<\/?blockquote\b[^\n]+)\s*\n/$1$2/g' file

foo<blockquote>That's one small step for man, one giant leap for mankind

A new line and another quote</blockquote>bar

-g or -0777 reads the whole file into memory
's///' is the substitution skeleton, as in sed
$1$2 are the two captured groups, like \1\2 in sed
the / in <\/? is escaped because / is also the s/// delimiter

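The regular expression matches as follows:

Node	Explanation
`(`	group and capture to $1:
`\w+`	word characters (a-z, A-Z, 0-9, _) (1 or more times, matching as many as possible)
`)`	end of $1
`\n`	'\n' (newline)
`\n`	'\n' (newline)
`(`	group and capture to $2:
`<\/?blockquote`	'<', an optional '/' (escaped because / is the s/// delimiter), then 'blockquote'
`\b`	word boundary anchor
`[^\n]+`	any character except '\n' (newline) (1 or more times, matching as many as possible)
`)`	end of $2
`\s*`	whitespace (\n, \r, \t, \f, and " ") (0 or more times, matching as many as possible)
`\n`	'\n' (newline)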

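Answer 3

awk 'BEGIN { waiting_for_tag=1; };
     # skip blank lines
     NF==0 { next; };
     # tag line: print the tag with no newline so the next line joins it
     $1 ~ "</?blockquote>" { printf "%s",$1; waiting_for_tag=0; next; };
     # a tag is expected next: print the line with no newline
     waiting_for_tag==1 { printf "%s",$0; next; };
     # first line after a tag: print it normally, then wait for the next tag
     { printf "%s\n",$0; waiting_for_tag=1; }' input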
foo<blockquote>That's one small step for man, one giant leap for mankind
A new line and another quote</blockquote>bar
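
The script above can be tried on a small file to watch the waiting_for_tag flag at work. This is only a sketch: the sample input is hypothetical (the original `input` file is not shown), and the awk program is the same one with the optional semicolons dropped.

```shell
# Hypothetical sample input; the original input file is not shown.
tmp=$(mktemp)
printf 'foo\n\n<blockquote>\n\nfirst line\n\nsecond line\n\n</blockquote>\n\nbar\n' > "$tmp"

# Blank lines are skipped, tags are printed without a newline, and the flag
# decides whether a text line ends with a newline or waits for a tag.
result=$(awk 'BEGIN { waiting_for_tag=1 }
     NF==0 { next }
     $1 ~ "</?blockquote>" { printf "%s",$1; waiting_for_tag=0; next }
     waiting_for_tag==1 { printf "%s",$0; next }
     { printf "%s\n",$0; waiting_for_tag=1 }' "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

Unlike the other two answers, this one drops the blank line between the quoted lines, since every empty line is skipped by the NF==0 rule. It uses no GNU extensions, so a plain POSIX awk should be enough.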
