How can I use grep or sed to filter links out of HTML?

Answer 1

Try the following command:

curl -s http://www.example.com | grep -Po '(?<=src=")[^"]*(jpg|png)'

Explanation:

From man grep:

   -o, --only-matching
          Print only the matched (non-empty) parts of a matching line,
          with each such part on a separate output line.
   -P, --perl-regexp
          Interpret PATTERN as a Perl compatible regular expression (PCRE)

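The lookbehind (?<=src=") asserts that, at the current position in the string, the preceding characters are src=". Because a lookbehind is zero-width, src=" itself is not part of the match, so -o does not print it. After that position, [^"]*(jpg|png) matches a run of characters other than " that ends in jpg or png.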
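
The pattern can be tried offline against a small sample, as a rough sketch. The markup in `html` and the variable names `loose`, `strict`, and `portable` are made up for illustration. The stricter pattern and the grep -oE plus sed pipeline are assumptions added here, not part of the answer above; they are shown because the original pattern also accepts any value that merely contains jpg or png somewhere before the closing quote.

```shell
# Hypothetical sample markup standing in for the curl output.
html='<p><img src="a/logo.png" alt="x"><img src="photo.jpg"></p>
<script src="app.js"></script>
<img src="notapng-file.txt">'

# Pattern from the answer: text after src=" that ends in jpg or png.
# On the third line it stops at "notapng", which is not an image path.
loose=$(printf '%s\n' "$html" | grep -Po '(?<=src=")[^"]*(jpg|png)')

# Stricter variant (an assumption): require a literal dot before the
# extension and the closing quote immediately after it.
strict=$(printf '%s\n' "$html" | grep -Po '(?<=src=")[^"]*\.(jpg|png)(?=")')

# Without -P: match the whole attribute with an extended regex,
# then strip the src=" prefix and the trailing quote with sed.
portable=$(printf '%s\n' "$html" \
  | grep -oE 'src="[^"]*\.(jpg|png)"' \
  | sed 's/^src="//; s/"$//')

printf '%s\n' "$strict"
```

With this sample, `strict` and `portable` both contain a/logo.png and photo.jpg, while `loose` contains an extra notapng line. The -P option depends on grep being built with PCRE support, so the grep -oE plus sed form may be the safer choice where that is not guaranteed.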
