Perl script that uses regular expressions to extract links from a web page

Answer 1

Don't use regular expressions to parse HTML, especially since Perl makes it easy to do the job properly with a real parser. For example:

#!/usr/bin/env perl

use strict;
use warnings;

use HTML::LinkExtor;

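# @web collects every link found; $fn is the current file; $p is the parser.
my ( @web, $fn, $p );

# Callback invoked once per link-bearing tag: the first argument is the tag
# name, the rest are attribute name/value pairs whose values are links.
sub cb {
    my ( undef, %links ) = @_;
    push @web, values %links;
}

$p = HTML::LinkExtor->new( \&cb );

# Treat each command-line argument as an HTML file to parse.
# defined() keeps a file literally named "0" from ending the loop early.
while ( defined( $fn = shift ) ) {
    $p->parse_file($fn);
    $p->eof;
}

print "$_\n" for (@web);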
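
The callback above collects every link-bearing attribute, so image sources and stylesheet references end up in the list alongside page links. If only the hyperlinks are wanted, the same module can be narrowed down. The following is a minimal sketch, not part of the original answer: the helper name extract_anchor_links, the sample HTML, and the example.com base URL are all made up for illustration. It keeps only the href of <a> tags and passes a base URL as the second argument to HTML::LinkExtor->new so that relative links are resolved to absolute ones.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

use HTML::LinkExtor;

# Hypothetical helper (not from the original answer): return the href of
# every <a> tag in $html, resolved against $base.
sub extract_anchor_links {
    my ( $html, $base ) = @_;
    my @found;

    my $parser = HTML::LinkExtor->new(
        sub {
            my ( $tag, %attr ) = @_;

            # Skip <img>, <link>, <script> and other non-anchor tags.
            return unless $tag eq 'a' && defined $attr{href};

            # With a base URL the value is a URI object; stringify it.
            push @found, "$attr{href}";
        },
        $base,
    );

    $parser->parse($html);
    $parser->eof;

    return @found;
}

# Illustrative input only.
my $html = '<a href="/docs">Docs</a> <img src="logo.png"> <a href="faq.html">FAQ</a>';
print "$_\n" for extract_anchor_links( $html, 'http://example.com/site/' );
```

Parsing from a string with parse instead of parse_file makes the helper easy to reuse on content that has already been downloaded; the file-based loop in the answer works the same way with this callback.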
