How to replace the end of a line with fixed text when the next line begins with a defined group of characters?

Answer 1

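With sed:

sed -z -e 's/\n#S/ #S/g' -e 's/\nN /N /g' data

In slow motion:

-z makes sed treat the file as a single line (so line ends are plain characters); this is a GNU sed option
's/\n#S/ #S/g' replaces every LF before a #S with a space
-e 's/\nN /N /g' removes every LF before "N " (i.e. the blank lines)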
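
As a quick check of the command above, here is a minimal sketch that runs it on a small made-up sample. The sample contents and the variable names data and joined are assumptions for illustration, not taken from the original question, and it needs a sed that supports -z (such as GNU sed).

```shell
# Hypothetical sample: groups made of an "N" line, two "#S" lines and a blank line.
data=$(mktemp)
printf 'N 1\n#S a\n#S b\n\nN 2\n#S c\n#S d\n\n' > "$data"

# Same command as above, run on the sample file.
joined=$(sed -z -e 's/\n#S/ #S/g' -e 's/\nN /N /g' "$data")

printf '%s\n' "$joined"
# N 1 #S a #S b
# N 2 #S c #S d

rm -f "$data"
```

Each group ends up on a single line and the blank separator lines are gone.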

Answer 2

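With paste (this requires that the lines always come in groups of 4):

 paste -s -d '   \n' data

In slow motion:

paste -s joins the lines of the file together
-d specifies the characters to insert as delimiters. When there are several, they are used cyclically, hence 3 spaces and one LF:
- the first space is used for the first join (N to #S),
- the second space for the second join (#S to #S),
- the third space for the third join (#S to the blank line),
- the last delimiter, LF, for the fourth join (blank line to N),
- and the cycle repeats for the next 4 lines.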
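
The delimiter cycle described above can be checked with a minimal sketch on a made-up sample. The sample contents and the names data and pasted are assumptions for illustration. Because the third space joins the last #S line to the blank line, each output line ends with a trailing space.

```shell
# Hypothetical sample: two groups of exactly 4 lines (N, #S, #S, blank).
data=$(mktemp)
printf 'N 1\n#S a\n#S b\n\nN 2\n#S c\n#S d\n\n' > "$data"

# Same command as above: delimiters are space, space, space, LF, used cyclically.
pasted=$(paste -s -d '   \n' "$data")

# Show the result with visible line ends so the trailing space can be seen.
printf '%s\n' "$pasted" | sed 's/$/|/'
# N 1 #S a #S b |
# N 2 #S c #S d |

rm -f "$data"
```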

Answer 3

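Here is a portable solution for POSIX sed that implements the following rules:

empty lines are deleted;
any line starting with #S is joined to the previous non-empty line, with a single space character between them, unless there is no previous non-empty line.

Code:

<data sed '/^$/ d; :start; N; s/\n$//; t start; s/\n#S/ #S/; t start; P; D'

The same with comments (still working code):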

<data sed '
  /^$/ d      # If empty line read, delete it and start a new cycle.
  :start      # A label.
  N           # Read additional line, there are now two lines in the pattern space.
  s/\n$//     # If the second line is empty, replace the newline with nothing.
  t start     # If the above replacement occurred, go to start (to add another line).
              # Otherwise
  s/\n#S/ #S/ # if the second line starts with #S, replace the newline with space.
  t start     # If the above replacement occurred, go to start (to add another line).
              # Otherwise
              # (i.e when non-empty line not starting with #S occurred)
  P           # print the pattern space up to the first newline and...
  D           # delete the initial segment of the pattern space
              # through the first newline (i.e. everything just printed),
              # and start the next cycle with the resultant pattern space
              # and without reading any new input
              # (in our case the new input will be explicitly read by N then).
  '

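Note that the solution uses the sed pattern space to accumulate many input lines. This note applies:

The pattern and hold spaces shall each be able to hold at least 8192 bytes.

Just before the P command, the pattern space holds one (relatively long) line about to be printed, one (relatively short) input line, and the newline between them. Whether this structure ever exceeds 8192 bytes depends on your data. If it does, some sed implementations may fail.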
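
To see the two rules at work on irregular input, here is a minimal sketch. The sample (a leading blank line, a blank line between two #S lines, two consecutive blank lines) and the names data and merged are assumptions for illustration. The result shown was worked out for GNU sed; how the very last line is handled depends on what the implementation does when N runs out of input.

```shell
# Hypothetical irregular sample: blank lines in awkward places.
data=$(mktemp)
printf '\nN 1\n#S a\n\n#S b\nN 2\n\n\n#S c\n' > "$data"

# Same one-liner as above, reading the sample from standard input.
merged=$(<"$data" sed '/^$/ d; :start; N; s/\n$//; t start; s/\n#S/ #S/; t start; P; D')

printf '%s\n' "$merged"
# N 1 #S a #S b
# N 2 #S c

rm -f "$data"
```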

Answer 4

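awk (gawk ^[1])

As an alternative to the usual sed, you can use awk (and in many different ways...)

awk 'ORS=" "; NR % 4 == 0 && ORS="\n" ' data

where

ORS=" " sets the output record separator (a newline by default) to a space (you can change it)
NR % 4 == 0 && ORS="\n" switches it back to a newline \n on every fourth line
when nothing else is specified, awk prints the whole line; on every fourth line both rules are true, so that line is printed twice, which is harmless here only because it is the empty line
data is your data file.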
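
As a rough check of the ORS trick, here is a minimal sketch on a made-up sample. The sample contents and the names data and flat are assumptions for illustration, and it assumes an awk that accepts an assignment on the right-hand side of && (gawk does). Because the empty fourth line is printed once with a space and once with a newline, each output line ends with two spaces.

```shell
# Hypothetical sample: two groups of exactly 4 lines (N, #S, #S, blank).
data=$(mktemp)
printf 'N 1\n#S a\n#S b\n\nN 2\n#S c\n#S d\n\n' > "$data"

# Same command as above, run on the sample file.
flat=$(awk 'ORS=" "; NR % 4 == 0 && ORS="\n" ' "$data")

# Show the result with visible line ends so the trailing spaces can be seen.
printf '%s\n' "$flat" | sed 's/$/|/'
# N 1 #S a #S b  |
# N 2 #S c #S d  |

rm -f "$data"
```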

If you prefer, you can use regular expressions as with sed (in a similar way).
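
One possible reading of that remark, as a minimal sketch: let regular expressions decide what happens to each line instead of counting lines. This is not the author's code; the variable line, the sample contents and the names data and grouped are assumptions for illustration.

```shell
# Hypothetical sample: two groups (N, #S, #S, blank).
data=$(mktemp)
printf 'N 1\n#S a\n#S b\n\nN 2\n#S c\n#S d\n\n' > "$data"

grouped=$(awk '
  /^$/  { next }                                          # drop blank lines
  /^#S/ { line = (line == "" ? $0 : line " " $0); next }  # append #S lines to the pending line
        { if (line != "") print line; line = $0 }         # any other line: flush, start a new group
  END   { if (line != "") print line }                    # flush the last group
' "$data")

printf '%s\n' "$grouped"
# N 1 #S a #S b
# N 2 #S c #S d

rm -f "$data"
```

Unlike the NR % 4 version, this sketch does not depend on groups having exactly 4 lines and leaves no trailing spaces.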

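A format-checking version with awk

Even though it was not requested, you may want to handle truncated files: suppress the broken output line, and produce an error status together with an error message.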

awk '{a=$0; getline b; getline c; 
     if ( getline > 0 ) {print a, b, c, $0 } 
     else { print "Ohi " > "/dev/stderr" ; exit 65; }  }' data

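where

a=$0; stores the whole line in the variable a
getline b; reads a line and stores it in the variable b
getline c; the obscure command :-) (same as above, into the variable c)
if (getline > 0) if it manages to read one more line...
..........{print a, b, c, $0} prints the 4 lines
else prints an error on the stderr device (the screen or elsewhere); you can customize it here...
exit 65 returns a non-zero exit code ---> error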
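
To see the check fire, here is a minimal sketch that feeds the same program a truncated file. The sample contents and the names data, out and status are assumptions for illustration. The error message is discarded here so that only the regular output and the exit status are captured.

```shell
# Hypothetical truncated sample: one complete group of 4 lines, then only 2 lines.
data=$(mktemp)
printf 'N 1\n#S a\n#S b\n\nN 2\n#S c\n' > "$data"

status=0
out=$(awk '{a=$0; getline b; getline c;
     if ( getline > 0 ) {print a, b, c, $0 }
     else { print "Ohi " > "/dev/stderr" ; exit 65; }  }' "$data" 2>/dev/null) || status=$?

printf 'output: %s\n' "$out"          # only the complete group is printed
printf 'exit status: %s\n' "$status"  # 65, because the second group is incomplete

rm -f "$data"
```

The complete group still comes out (with a trailing space, since its fourth line is empty), while the incomplete one is dropped and reported through the exit status.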

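Bonus: why 65?

Looking for an exit code that suits you ^[2], you may find one suggested in /usr/include/sysexits.h, a header of conventional exit codes found on many systems...

  #define EX_DATAERR      65      /* data format error */

65 is the one best suited to a data format error...

To be honest, I would have preferred the answer 42,
but any value that is non-zero (and not reserved ^[2]) would do, and 65 is the specific one...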
