Making grep understand byte escapes

I'm trying to match some UTF-8 characters. The problem is that grep does not convert \x byte escapes, so this fails:

echo -e '\xd8\xaa' | grep -P '\xd8\xaa'

While this succeeds:

echo -e '\xd8\xaa' | grep -P $(printf '\xd8\xaa')

Can grep understand byte escapes directly, without going through printf? If so, how?

Answer 1

This fails (no output):

$ echo -e '\xd8\xaa' | grep -P '\xd8\xaa' | hexdump

This succeeds:

$ echo -e '\xd8\xaa' | grep -P $'\xd8\xaa' | hexdump
0000000 aad8 000a                              
0000003
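
A note on reading that dump: by default hexdump prints 16-bit words, so on a little-endian machine the bytes d8 aa 0a show up as aad8 000a. A minimal sketch, assuming a POSIX od is available, that prints the same bytes one at a time in their actual order (the variable name bytes is only for illustration):

```shell
#!/usr/bin/env bash
# od -An -tx1 prints one byte per column, in stream order, with no offset column.
# tr and sed only normalise the whitespace that od pads its output with.
bytes=$(printf '\xd8\xaa\n' | od -An -tx1 | tr -s ' \n' ' ' | sed 's/^ //; s/ $//')
echo "$bytes"   # d8 aa 0a
```

hexdump -C gives a similar byte-by-byte view where hexdump is installed.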

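Documentation

man bash

Words of the form $'string' are treated specially. The word expands to string, with backslash-escaped characters replaced as specified by the ANSI C standard. Backslash escape sequences, if present, are decoded as follows: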

          \a     alert (bell)
          \b     backspace
          \e
          \E     an escape character
          \f     form feed
          \n     new line
          \r     carriage return
          \t     horizontal tab
          \v     vertical tab
          \\     backslash
          \'     single quote
          \"     double quote
          \?     question mark
          \nnn   the eight-bit character whose value is the octal value nnn (one to three digits)
          \xHH   the eight-bit character whose value is the hexadecimal value HH (one or two hex digits)
          \uHHHH the Unicode (ISO/IEC 10646) character whose value is the hexadecimal value HHHH (one to four hex digits)
          \UHHHHHHHH
                 the Unicode (ISO/IEC 10646) character whose value is the hexadecimal value HHHHHHHH (one to eight hex digits)
          \cx    a control-x character

The expanded result is single-quoted, as if the dollar sign had not been present.
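
To make the difference concrete, here is a small sketch (not part of the man page) comparing what grep would receive under each quoting style. The variable names literal and decoded are made up for illustration, and grep -F is used instead of -P so that the result does not depend on PCRE support or on the locale's handling of \x inside a regex:

```shell
#!/usr/bin/env bash
# Plain single quotes: grep would receive the eight characters \xd8\xaa.
literal='\xd8\xaa'
# ANSI-C quoting: bash decodes the escapes first, so grep would receive two bytes.
decoded=$'\xd8\xaa'

# Count bytes with wc -c; the arithmetic expansion strips any padding wc adds.
literal_bytes=$(( $(printf '%s' "$literal" | wc -c) ))
decoded_bytes=$(( $(printf '%s' "$decoded" | wc -c) ))
echo "literal: $literal_bytes bytes, decoded: $decoded_bytes bytes"

# -F matches the pattern as a fixed string, so no regex escape handling is involved.
if printf '\xd8\xaa\n' | grep -qF -- "$decoded"; then
  match=yes
else
  match=no
fi
echo "match: $match"
```

The point is that the decoding is done by bash before grep ever starts, so it works the same way with any grep option or any other command.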
