Efficient way to search for a string in files with find and grep

Question 1

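The fastest way I can think of is to use xargs to share the load, so that grep is started once per batch of files rather than once per file: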

find . -type f -print0  | xargs -0 grep -Fil "mypattern"

Some benchmarks, run on a directory containing 3631 files:

$ time find . -type f -exec grep -l -i "mystring" {} 2>/dev/null \;

real    0m15.012s
user    0m4.876s
sys     0m1.876s

$ time find . -type f -exec grep -Fli "mystring" {} 2>/dev/null \;

real    0m13.982s
user    0m4.328s
sys     0m1.592s


$ time find . -type f -print0  | xargs -0 grep -Fil "mystring" >/dev/null 

real    0m3.565s
user    0m3.508s
sys     0m0.052s
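Most of the gap in the timings above comes from "{} \;" starting a new grep process for every file. As a rough sketch of a related alternative (not something benchmarked above), find can do the batching itself when -exec is terminated with "+" instead of "\;". The temporary directory and file names below are made up for illustration:

```shell
# Build a small throwaway tree so the sketch is self-contained.
demo_dir=$(mktemp -d)
printf 'has MyString here\n' > "$demo_dir/a.txt"
printf 'nothing relevant\n' > "$demo_dir/b.txt"

# "{} +" appends many file names to each grep invocation,
# instead of starting one grep per file as "{} \;" does.
batch_hits=$(find "$demo_dir" -type f -exec grep -Fli "mystring" {} +)
echo "$batch_hits"

# Remove the throwaway tree.
rm -rf "$demo_dir"
```

This should behave much like the xargs pipeline, and it sidesteps the need for -print0 and -0 because the file names never pass through a pipe.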

Your other option is to cut down the work by restricting the file list with find's tests:

   -executable
          Matches files which are executable and  direc‐
          tories  which  are  searchable (in a file name
          resolution sense).  
   -writable
          Matches files which are writable.             

   -mtime n
          File's  data was last modified n*24 hours ago.
          See the comments for -atime to understand  how
          rounding  affects  the  interpretation of file
          modification times.
   -group gname
          File  belongs to group gname (numeric group ID
          allowed).
   -perm /mode
          Any  of  the  permission bits mode are set for
          the file.  Symbolic modes are accepted in this
          form.  You must specify `u', `g' or `o' if you
          use a symbolic mode. 
   -size n[cwbkMG]  <-- you can set a minimum or maximum size
          File uses n units  of  space.
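As an illustration of how one of these tests combines with the xargs pipeline, here is a minimal sketch that skips large files using -size. The 1024k threshold and the file names are assumptions chosen only for the demo; pick whatever limit suits your data:

```shell
# Throwaway tree: one small text file and one roughly 2 MiB file,
# both containing the pattern.
filter_dir=$(mktemp -d)
printf 'mypattern in a small file\n' > "$filter_dir/small.log"
{ printf 'mypattern\n'; dd if=/dev/zero bs=1024 count=2048 2>/dev/null; } > "$filter_dir/big.bin"

# "-size -1024k" keeps only files smaller than 1024 KiB, so grep
# never opens big.bin at all.
filtered_hits=$(find "$filter_dir" -type f -size -1024k -print0 |
  xargs -0 grep -Fil "mypattern")
echo "$filtered_hits"

# Remove the throwaway tree.
rm -rf "$filter_dir"
```

The same pattern applies to the other tests, for example -mtime or -group: anything find filters out is a file grep never has to read.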

Or by tuning grep:

You are already using grep's -l option, which prints only the file name and, more importantly, stops scanning each file at its first match:

   -l, --files-with-matches
       Suppress normal output; instead print the name of each input file  from
       which  output would normally have been printed.  The scanning will stop
       on the first match.  (-l is specified by POSIX.)

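The only other thing I can think of to speed things up is to use the -F option, so that your pattern is treated as a fixed string rather than interpreted as a regular expression (as @suspectus suggested).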
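Besides any speed gain, which will depend on the grep implementation and the pattern, -F also changes what matches. A small sketch of the difference, with file names and contents invented for the demo:

```shell
# Two throwaway files: one contains the literal text "1.5",
# the other contains "125", which only matches if "." is a wildcard.
fixed_dir=$(mktemp -d)
printf 'version 1.5 released\n' > "$fixed_dir/literal.txt"
printf 'version 125 released\n' > "$fixed_dir/lookalike.txt"

# As a regular expression, "." matches any character, so both files are listed.
regex_hits=$(cd "$fixed_dir" && grep -l "1.5" literal.txt lookalike.txt)

# With -F the pattern is a plain string, so only the literal "1.5" matches.
fixed_hits=$(cd "$fixed_dir" && grep -Fl "1.5" literal.txt lookalike.txt)

echo "regex: $regex_hits"
echo "fixed: $fixed_hits"

# Remove the throwaway tree.
rm -rf "$fixed_dir"
```

So -F is worth using whenever the search term is a literal string, both to avoid regex processing and to avoid accidental matches on metacharacters.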

Answer

The fastest way I can think of is to use xargs to share the load:

find . -type f -print0  | xargs -0 grep -Fil "mypattern"