Perl regular expression does not match when using hex or octal codes

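I need to use Perl to process data files that contain many non-printable characters (that is, characters outside the printable range of the ASCII table). I tried writing regular expressions that use octal codes for the non-printable characters I am looking for, but I could not get them to match. So I decided to try the octal and hex codes for the letter "e", just to check whether my approach was right. Even that did not work. Here is a simple example: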

use strict;
use warnings;

my $string = "This is a test string";
print "My string before is \"$string\".\n";

#   The letter 'e' has a position of 101 in the ASCII collating sequence,
#   which is hex '65' and octal 0145.

$string =~ s/0x65//;
print "My string after trying the hex code is \"$string\".\n";

$string =~ s/\0145//;
print "My string after trying the octal code is \"$string\".\n";

The output is:

My string before is "This is a test string".
My string after trying the hex code is "This is a test string".
My string after trying the octal code is "This is a test string".

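Clearly I am not encoding the search pattern correctly in either octal or hex notation, but after a lot of web searching I still cannot find the right way to do it.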

Answer 1

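OK, I finally found the solution. Brian D Foy describes it in an article (https://www.effectiveperlprogramming.com/2010/10/specify-any-character-by-its-octal-ordinal-value/). I had indeed specified both the hex code and the octal code incorrectly. After finding Brian's article, I corrected my program to the following: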

use strict;
use warnings;

my $string = "This is a test string";
print "My string before is \"$string\".\n";

#   The letter 'e' has a position of 101 in the ASCII collating sequence,
#   which is hex '65' and octal 0145.

$string =~ s/\x65//;
print "My string after trying the hex code is \"$string\".\n";

$string =~ s/\145//;
print "My string after trying the octal code is \"$string\".\n";

With these corrections, my program's output is now:

My string before is "This is a test string".
My string after trying the hex code is "This is a tst string".
My string after trying the octal code is "This is a tst string".
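
For anyone wondering why the original patterns failed, here is a small sketch of my understanding. `0x65` without a backslash is just the literal text "0x65", and `\0145` is read as `\014` (a form feed) followed by a literal "5", because an octal escape takes at most three digits. The braced forms `\x{...}` and `\o{...}` avoid that ambiguity; `\o{...}` assumes Perl 5.14 or later. The variable names and the sample data in `$raw` are made up for illustration, and the last substitution is only one possible way to strip non-printable ASCII, not something taken from the article.

```perl
use strict;
use warnings;

my $string = "This is a test string";

# Without a backslash, 0x65 is just the four literal characters "0", "x", "6", "5".
my $hex_literal_found = ($string =~ /0x65/) ? 1 : 0;        # 0: no such text in $string
my $hex_literal_demo  = ("value 0x65" =~ /0x65/) ? 1 : 0;   # 1: matches the literal text

# \0145 is read as \014 (form feed) followed by a literal "5".
my $octal_as_ff5 = ("\f5" =~ /^\0145$/) ? 1 : 0;            # 1

# Braced forms make the extent of the number explicit.
(my $hex_braced   = $string) =~ s/\x{65}//;   # removes the first "e"
(my $octal_braced = $string) =~ s/\o{145}//;  # same result; \o{...} assumes Perl 5.14+

# Hypothetical sample containing non-printable bytes (NUL, BEL, ESC).
my $raw = "abc\x00\x07def\x1Bghi";
(my $printable_only = $raw) =~ s/[^\x20-\x7E]//g;  # keep only printable ASCII

print "$hex_braced\n$octal_braced\n$printable_only\n";
```

For real data files, a negated range such as `[^\x20-\x7E]` also removes tabs and newlines, so the range may need adjusting to whatever counts as "printable" for the file at hand.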

Thanks, Brian!
