Merge words on separate lines into a single line

Question 1

$ awk -v RS= '{$1=$1}1' file
These are the words for sentence 1
these are the words for sentence 2
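
The input file is not shown here. As a minimal sketch, assume it holds one word per line with an empty line between the two sentences (the `sample` variable below is made up for illustration). Under that assumption, the one-liner behaves like this:

```shell
# Assumed sample input (not from the original post): one word per line,
# with an empty line separating the two sentences.
sample=$(printf '%s\n' These are the words for sentence 1 '' \
                       these are the words for sentence 2)

# RS= (empty string) switches awk to paragraph mode, so each block of lines
# separated by blank lines is one record. Assigning $1=$1 forces awk to
# rebuild $0 from its fields, joined by OFS (a single space by default).
# The trailing 1 is an always-true pattern, so the record is printed.
joined=$(printf '%s\n' "$sample" | awk -v RS= '{$1=$1}1')
printf '%s\n' "$joined"
```

Because the default field separator treats any run of blanks or newlines as one separator, every word ends up separated by exactly one space.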

Question 2

Using awk:

awk 'BEGIN { RS = "" } {gsub(/ *\n */, " "); print}' FILE
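
The gsub variant differs from rebuilding the record with $1=$1 in one visible way: it only rewrites the line breaks (and any spaces touching them), so spacing inside a line is kept. A small sketch on made-up input, where the first line contains a run of three spaces:

```shell
# Assumed sample input: the first line contains three consecutive spaces.
sample=$(printf '%s\n' 'alpha   beta' 'gamma' '' 'delta' 'epsilon')

# gsub replaces each newline, together with adjacent spaces, by one space.
# Spaces in the middle of a line are left alone.
with_gsub=$(printf '%s\n' "$sample" |
  awk 'BEGIN { RS = "" } { gsub(/ *\n */, " "); print }')

# $1=$1 rebuilds the whole record, squeezing every run of whitespace.
with_rebuild=$(printf '%s\n' "$sample" | awk -v RS= '{$1=$1}1')

printf '%s\n' "$with_gsub" "$with_rebuild"
```

Here `with_gsub` keeps "alpha   beta" intact, while `with_rebuild` reduces it to "alpha beta". Which one is preferable depends on whether the spacing inside a line matters.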

Question 3

Use the GNU sed editor in extended regex mode, with the hold space storing the non-empty lines.

sed -Ee 's/^\s+|\s+$//g
  /./{H;$!d;}
  x;s/.//;y/\n/ /
' file

Another approach uses awk's built-in variables:

awk -v RS= '
BEGIN{FS=ORS}
{$1=$1}1
' file
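
To see what the sed script does step by step, here is a sketch on made-up input in which some lines carry leading or trailing blanks. It assumes GNU sed, since `\s` and `-E` as used here are not guaranteed elsewhere.

```shell
# Assumed sample input: some lines have leading or trailing blanks.
sample=$(printf '%s\n' '  These' 'are  ' 'words' '' 'these' '  are' 'words')

# Script line 1: trim whitespace at both ends of the current line.
# Script line 2: a non-empty line is appended to the hold space (H) and
#                deleted from the pattern space unless it is the last line.
# Script line 3: on an empty line (or the last line) swap the held lines in
#                (x), drop the newline that H put in front (s/.//), and turn
#                the remaining newlines into spaces (y).
joined=$(printf '%s\n' "$sample" | sed -Ee 's/^\s+|\s+$//g
  /./{H;$!d;}
  x;s/.//;y/\n/ /
')
printf '%s\n' "$joined"
```

Each empty line flushes the accumulated sentence, and the `$!d` guard makes sure the final sentence is flushed even when the input does not end with an empty line.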

Question 4

$ perl -00 -aE 'say join " ", @F' input.txt 
These are the words for sentence 1
these are the words for sentence 2

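-00 tells perl to read the file in paragraph mode (paragraphs are separated by one or more blank lines).
-a tells perl to auto-split the input on whitespace into the array @F (similar to how awk auto-splits its input into $1, $2, $3, and so on).

-a also implicitly sets the -n option, which makes perl behave like sed -n (read all input without automatically printing it). This can be overridden by adding the -p option to the command line (automatically print the possibly modified input, like sed without -n).
-E enables all optional features for the script, such as the say function, which automatically appends a newline after printing. That is slightly simpler than print join(" ", @F), "\n", which is what you would have to write if you used -e instead of -E.

say has been part of Perl for a long time and arguably should be enabled by default, but the Perl developers decided against that long ago, because it risked breaking old scripts that define their own say function.
The join() function joins the elements of the array @F with a space between each element.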

Alternatively, you can set the output field separator ($,) instead of using join:

$ perl -00 -aE 'BEGIN{$,=" "}; say @F' input.txt 
These are the words for sentence 1
these are the words for sentence 2

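Unlike awk, where the default OFS is a space character, the default output field separator in perl is empty (undefined). This prints the array without any spaces between the words: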

$ perl -00 -aE 'say @F' input.txt 
Thesearethewordsforsentence1
thesearethewordsforsentence2

Not quite what you want.
