Question
Why does my grep command match some lines and then stop with this error? grep: /var/log/apache2/modsec_audit.log: binary file matches
My grep command: grep '^\[' /var/log/apache2/modsec_audit.log
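My guess is that the log file contains binary content that confuses grep, but that is only a guess, so please explain the cause. Please also explain what I can do to work around the problem.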
The log file itself is ASCII text, and I can read it with less. The file command reports: /var/log/apache2/modsec_audit.log: HTML document, ASCII text, with very long lines (9938)
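Setup
I have ModSecurity (modsec) installed on my web server. The audit file consists of multi-line records, described here: https://github.com/SpiderLabs/ModSecurity/wiki/ModSecurity-2-Data-Formats#user-content-Parts
For reference, I am logging these parts: SecAuditLogParts ABCEFHJZ
Part A is the audit log header. It is a single line containing the following information: timestamp, unique transaction ID, source IP address (IPv4 or IPv6), source port, destination IP address (IPv4 or IPv6), destination port.
For example: [05/Jan/2024:00:45:31.734758 +0000] ZZdRKyjPxuLDuK2XVhEfLgAAAAU 198.12.243.17 13914 192.168.2.143 443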
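What I am trying to do
I am trying to grep the part A lines out of the many lines in the file and use them as a simple reference for what happened on a given day and when bot activity was busiest. There are much better ways to do this, but please stay on topic: what is going wrong with grep, and how do I get past it?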
Answer 1
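By default, grep does not like to output binary data (doing so could, for example, mess up the terminal), so for a match in a binary file it only reports "binary file matches".
If you want the output anyway, you probably want the -a option.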
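To make the effect of -a concrete, here is a minimal sketch. It assumes GNU grep and builds a throwaway sample file; the second transaction ID, the 203.0.113.5 address, and the payload line are hypothetical stand-ins, not taken from a real audit log. A single NUL byte in the body of a record is enough for grep to classify the whole file as binary.

```shell
# Build a small sample log (hypothetical content) that contains one NUL byte.
sample=$(mktemp)
printf '[05/Jan/2024:00:45:31.734758 +0000] ZZdRKyjPxuLDuK2XVhEfLgAAAAU 198.12.243.17 13914 192.168.2.143 443\n' > "$sample"
printf 'request body with a NUL byte: \000 end\n' >> "$sample"
printf '[05/Jan/2024:00:46:02.000000 +0000] hypothetical-id 203.0.113.5 40000 192.168.2.143 443\n' >> "$sample"

# Default mode: grep notices the binary data and reports that a binary file
# matches instead of printing every matching line (the exact wording and
# whether it goes to stdout or stderr depend on the grep version).
default_out=$(grep '^\[' "$sample" 2>&1)

# With -a the file is processed as text, so every part A header is printed.
text_out=$(grep -a '^\[' "$sample")
text_count=$(grep -a -c '^\[' "$sample")

printf '%s\n' "$text_out"
rm -f "$sample"
```

Applied to the command from the question, this would be grep -a '^\[' /var/log/apache2/modsec_audit.log.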
For details, see the relevant section of the manual:
-a, --text
Process a binary file as if it were text; this is equivalent to the --binary-files=text option.
--binary-files=TYPE
If a file's data or metadata indicate that the file contains binary data, assume that the file is of
type TYPE. Non-text bytes indicate binary data; these are either output bytes that are improperly
encoded for the current locale, or null input bytes when the -z option is not given.
By default, TYPE is binary, and grep suppresses output after null input binary data is discovered, and
suppresses output lines that contain improperly encoded data. When some output is suppressed, grep
follows any output with a message to standard error saying that a binary file matches.
If TYPE is without-match, when grep discovers null input binary data it assumes that the rest of the
file does not match; this is equivalent to the -I option.
If TYPE is text, grep processes a binary file as if it were text; this is equivalent to the -a option.
When type is binary, grep may treat non-text bytes as line terminators even without the -z option.
This means choosing binary versus text can affect whether a pattern matches a file. For example, when
type is binary the pattern q$ might match q immediately followed by a null byte, even though this is
not matched when type is text. Conversely, when type is binary the pattern . (period) might not match
a null byte.
Warning: The -a option might output binary garbage, which can have nasty side effects if the output is
a terminal and if the terminal driver interprets some of it as commands. On the other hand, when
reading files whose text encodings are unknown, it can be helpful to use -a or to set LC_ALL='C' in the
environment, in order to find more matches even if the matches are unsafe for direct display.
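The warning above suggests a cautious pattern when the output goes to a terminal: force text mode, then replace anything non-printable before it is displayed. The following is only a sketch under stated assumptions: the sample file, the transaction IDs, and the stray \377 byte on a header line are hypothetical, and the tr filter is one possible way to sanitize output, not something the manual prescribes.

```shell
# Build a sample file (hypothetical content) with a non-ASCII byte on a matching line.
sample=$(mktemp)
printf '[05/Jan/2024:00:45:31.734758 +0000] hypothetical-id-1 198.12.243.17 13914 192.168.2.143 443 \377\n' > "$sample"
printf 'not a header line\n' >> "$sample"
printf '[05/Jan/2024:00:46:02.000000 +0000] hypothetical-id-2 203.0.113.5 40000 192.168.2.143 443\n' >> "$sample"

# LC_ALL=C makes grep treat every byte as a character, and -a forces text mode.
# tr then replaces each byte that is neither printable nor a newline with '?',
# so nothing the terminal could interpret as a command reaches the screen.
safe_out=$(LC_ALL=C grep -a '^\[' "$sample" | LC_ALL=C tr -c '[:print:]\n' '?')

printf '%s\n' "$safe_out"
rm -f "$sample"
```

Redirecting the grep -a output to a file and inspecting it with less is another option that avoids writing raw bytes to the terminal.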