How do I use sed to replace everything except a specific pattern?

Answer 1

For the input given, this sed expression appears to do what you want:

$ cat input
>TRINITY_DN75270_c3_g2::TRINITY_DN75270_c3_g2_i4::g.22702::m.22702 [sample]
$ sed 's/^.*::\([A-Z_0-9a-z]*\)::.*\[\(.*\)\].*/\1[\2]/' input
TRINITY_DN75270_c3_g2_i4[sample]

The trick is to use regular-expression groups and two backreferences to rebuild the desired output. Explanation:

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  ^                        the beginning of the string
  .*                       any character except \n (0 or more times
                           (matching the most amount possible))
  ::                       '::'
  \(                       group and capture to \1:
    [A-Z_0-9a-z]*            any character of: 'A' to 'Z', '_', '0'
                             to '9', 'a' to 'z' (0 or more times
                             (matching the most amount possible))
  \)                       end of \1
  ::                       '::'
  .*                       any character except \n (0 or more times
                           (matching the most amount possible))
  \[                       '['
  \(                       group and capture to \2:
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
  \)                       end of \2
  \]                       ']'
  .*                       any character except \n (0 or more times
                           (matching the most amount possible))

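Here \1 is the key you want to extract, and \2 is the content inside the square brackets that follow it. The replacement \1[\2] then rebuilds the output you want.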
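The same substitution can be written with extended regular expressions, which some readers find easier to follow because the groups need no backslashes. This is only a sketch: the function name `extract_id` is made up for illustration, and it assumes a sed that accepts `-E` (GNU and BSD sed both do). Note that the greedy leading `.*` settles on the second field only because `g.22702` contains a dot, which the character class does not allow.

```shell
# Hypothetical helper: the same substitution in extended-regex (-E) syntax.
# Groups are written ( ... ) instead of \( ... \); literal brackets stay escaped.
extract_id() {
  sed -E 's/^.*::([A-Za-z0-9_]*)::.*\[(.*)\].*/\1[\2]/'
}

line='>TRINITY_DN75270_c3_g2::TRINITY_DN75270_c3_g2_i4::g.22702::m.22702 [sample]'
printf '%s\n' "$line" | extract_id
```

For the sample line this prints TRINITY_DN75270_c3_g2_i4[sample], the same result as the basic-regex version above.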

Answer 2

An awk alternative:

awk -F'::' '{ match($NF,/\[.+\]/); print $2 substr($NF,RSTART,RLENGTH) }' file

Output:

TRINITY_DN75270_c3_g2_i4[sample]

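-F'::' treats :: as the field separator, so $2 is the ID and $NF is the last field. match() locates the bracketed text in $NF and sets RSTART and RLENGTH, which substr() then uses to extract it.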
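To see what the one-liner is working with, the sketch below prints the two fields involved and the values that match() stores. The function name `show_fields` is an invention for this illustration and is not part of the original answer.

```shell
line='>TRINITY_DN75270_c3_g2::TRINITY_DN75270_c3_g2_i4::g.22702::m.22702 [sample]'

# Hypothetical helper: print the pieces the awk one-liner combines.
show_fields() {
  awk -F'::' '{
    match($NF, /\[.+\]/)                      # sets RSTART and RLENGTH
    printf "second field: %s\n", $2
    printf "last field: %s\n", $NF
    printf "RSTART=%d RLENGTH=%d\n", RSTART, RLENGTH
  }'
}

printf '%s\n' "$line" | show_fields
```

For the sample line the last field is "m.22702 [sample]", so match() reports RSTART=9 and RLENGTH=8, which is exactly the "[sample]" substring that substr() returns.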

Answer 3

sed -e '
   s/::/\n/; s//\n/
   s/.*\n\(.*\)\n.*\(\[[^]]*]\).*/\1\2/
' data

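We mark the ID by replacing the first and second occurrences of :: with newlines (the empty pattern in s//\n/ reuses the previous regular expression). We then strip everything except the marked region and the [...] region. Note that \n in the replacement works in GNU sed; other sed implementations may treat it differently.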

Result:

TRINITY_DN75270_c3_g2_i4[sample]
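Running only the marking step makes the approach easier to follow. This sketch assumes GNU sed, where \n in the replacement inserts a newline; the function name `mark_id` is hypothetical.

```shell
line='>TRINITY_DN75270_c3_g2::TRINITY_DN75270_c3_g2_i4::g.22702::m.22702 [sample]'

# Hypothetical helper: step 1 only, turning the first two "::" into newlines.
# Assumes GNU sed semantics for \n in the replacement text.
mark_id() {
  sed -e 's/::/\n/; s//\n/'
}

printf '%s\n' "$line" | mark_id
```

The output spans three lines, with the ID alone on the second one. The final substitution in the answer then keeps that middle segment and the bracketed text and discards the rest.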

Answer 4

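Assuming you want to keep the second field between the :: separators plus [sample], and therefore remove everything before that field and everything after it up to the last space, you can use: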

sed 's/^[^:]*::\([^:]*\)::.* /\1/'

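This matches from the start of the line up to the last space (.* is "greedy") and replaces the match with just the first "subexpression" (marked by the escaped parentheses).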

For more details on backreferences and subexpressions, see this description on gnu.org.
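Because the match runs to the last space, this approach depends on the bracketed text containing no spaces. The sketch below runs the command on the sample line and on a made-up line whose brackets contain a space; the function name `keep_second_field` and the second input are assumptions for illustration only.

```shell
# Hypothetical helper wrapping the command from this answer.
keep_second_field() {
  sed 's/^[^:]*::\([^:]*\)::.* /\1/'
}

# Sample line from the question: the only space precedes "[sample]".
printf '%s\n' '>TRINITY_DN75270_c3_g2::TRINITY_DN75270_c3_g2_i4::g.22702::m.22702 [sample]' | keep_second_field

# Made-up line: the last space falls inside the brackets, so part of the
# bracketed text is consumed by the greedy ".* ".
printf '%s\n' '>A::B::g.1::m.1 [sample one]' | keep_second_field
```

The first call prints TRINITY_DN75270_c3_g2_i4[sample]; the second prints Bone], which shows where the last-space assumption breaks down.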
