I am running grep on a file that is roughly 13 GB in size. It returns:
Binary file file.xml matches
I did not expect this. I thought it would return every line containing my string, so that I could run the following command:
grep "searchString" ./file.xml | wc -l
and get back a count of all occurrences of searchString in my large file.
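Answer 1
It looks like grep considers your XML file to be binary rather than text.
If you want to force grep to treat the file as text regardless of its content, you can use its --text switch (assuming GNU grep), like this: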
grep --text "searchString" ./file.xml | wc -l
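Note that if you only want to count matches, it is better to use grep --count rather than piping to wc -l, which saves a pipe and an extra process. Keep in mind that both approaches count matching lines, not individual occurrences.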
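As a rough sketch of the counting advice above, the following compares a per-line count with a per-occurrence count. The sample file and its contents are hypothetical stand-ins for the real file.xml, and the embedded NUL byte is only there to trigger binary detection.

```shell
# Hypothetical sample file: the NUL byte makes grep treat it as binary.
sample=$(mktemp)
printf 'searchString one searchString\n\000binary\nsearchString two\n' > "$sample"

# Count matching lines (a line with two hits is counted once).
line_count=$(grep --text --count "searchString" "$sample")

# Count every occurrence: -o prints each match on its own line.
match_count=$(grep --text -o "searchString" "$sample" | wc -l)

echo "lines: $line_count, occurrences: $match_count"
rm -f "$sample"
```

If the goal really is the number of occurrences rather than the number of matching lines, the -o variant is probably the one to use, at the cost of keeping the pipe to wc -l.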
Answer 2
It looks like the file contains some unusual characters near the beginning, so grep detects it as binary. You can try the --binary-files=text option.
grep --binary-files=text "searchString" file.xml | wc -l
From the man page:
--binary-files=TYPE
If the first few bytes of a file indicate that the file contains
binary data, assume that the file is of type TYPE. By default,
TYPE is binary, and grep normally outputs either a one-line
message saying that a binary file matches, or no message if
there is no match. If TYPE is without-match, grep assumes that
a binary file does not match; this is equivalent to the -I
option. If TYPE is text, grep processes a binary file as if it
were text; this is equivalent to the -a option. Warning: grep
--binary-files=text might output binary garbage, which can have
nasty side effects if the output is a terminal and if the
terminal driver interprets some of it as commands.
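To illustrate the equivalences mentioned in the man page excerpt, here is a small sketch on a hypothetical sample file (not the original file.xml). It uses -c and -q so that no binary data is ever written to the terminal, which sidesteps the warning above.

```shell
# Hypothetical sample file: the NUL byte triggers binary detection.
sample=$(mktemp)
printf 'searchString\000 in a line\nplain line\n' > "$sample"

# Long form: process the binary file as if it were text.
long_count=$(grep --binary-files=text -c "searchString" "$sample")

# Short form: -a is equivalent to --binary-files=text.
short_count=$(grep -a -c "searchString" "$sample")

# -I (--binary-files=without-match) treats the binary file as non-matching,
# so grep exits with a non-zero status here.
if grep -I -q "searchString" "$sample"; then skipped=no; else skipped=yes; fi

echo "long: $long_count, short: $short_count, skipped by -I: $skipped"
rm -f "$sample"
```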
Answer 3
It looks like you made a mistake by using ./file.xml. If you try:
grep "searchString" file.xml | wc -l
does it still have the problem?