Replacing a multi-line pattern in an HTML file

Answer 1

I would use Perl for this:

perl -0pe 's/<h1>Title.*\n.*<br>/replacement/' filename.html

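Here, -0 makes Perl split records on the NUL character instead of on newlines, so a text file that contains no NUL bytes is read as a single record. Reading line by line is the default when using the -p option.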

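In a Perl regular expression, .* matches any run of characters other than a newline, so the line break itself has to be matched explicitly with \n.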

Example:

$ echo '<body>
<h1>Title</h1><p>
<a href="url">Description</a><br>' | perl -0pe 's/<h1>Title.*\n.*<br>/replacement/'
<body>
replacement

Answer 2

sed cannot easily match across lines. When you need a multi-line pattern, use a more powerful tool such as Perl:

perl -i~ -ne 'if (/^<h1>Title/) {
                  $n = <>;   # read the line that follows the title
                  if ($n =~ /<br>$/) { print "Replacement\n" }
                  else { print "$_$n" }   # no match: keep both lines
              } else { print }' filename.html

Answer 3

This can be done with sed.

sed -nf repl.sed filename.html

where repl.sed contains:

# One line must be loaded before branching to rep.
# Processing starts out this way.
:rep
# Load an extra line into the pattern space
N
# Test for the title line followed by the link line
/<h1>.*<\/h1><p>\n<a href=".*">.*<\/a><br>/{
  # Substitute and print
  s/<h1>\(.*\)<\/h1><p>\n<a href=".*">.*<\/a><br>/Title: \1/p
  # Append the next line without starting a new cycle
  N
  # Drop everything up to the last newline, keeping only the last line
  s/.*\n//
  # Test for the last line
  ${
    p
    # This effectively ends the program
    n
  }
  b rep
}
${
  # Print the pattern space (both lines)
  p
  # This effectively ends the program
  n
}
# Print the first line in the pattern space
P
# Remove the first line, including its newline
s/.*\n//
b rep
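To check the behaviour, the sketch below writes a condensed copy of the script (comments dropped, the line-dropping substitution written as s/.*\n//) into a temporary directory and runs it on a four-line sample. The directory, the file demo.html and its content are assumptions. It was reasoned through for GNU sed, and other implementations may differ at end of input.

```shell
workdir=$(mktemp -d)

# Condensed copy of the script, without comments.
cat > "$workdir/repl.sed" <<'EOF'
:rep
N
/<h1>.*<\/h1><p>\n<a href=".*">.*<\/a><br>/{
  s/<h1>\(.*\)<\/h1><p>\n<a href=".*">.*<\/a><br>/Title: \1/p
  N
  s/.*\n//
  ${
    p
    n
  }
  b rep
}
${
  p
  n
}
P
s/.*\n//
b rep
EOF

# Hypothetical sample input.
cat > "$workdir/demo.html" <<'EOF'
<body>
<h1>Title</h1><p>
<a href="url">Description</a><br>
</body>
EOF

result=$(sed -nf "$workdir/repl.sed" "$workdir/demo.html")
printf '%s\n' "$result"
```

On this sample the two matching lines collapse into a single "Title: Title" line, while the lines around them pass through unchanged.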

See "Using multiple lines".
