Make `xargs` use the largest multiple of N arguments

Question 1

This is the wrong approach. If the point is to find all files whose names contain one of those IDs as one of their space-delimited words, you could do:

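find /dir -type f -print0 |
  gawk '
    # First input (ids.txt): store each ID as an array key, one ID per line.
    !ids_processed {ids[$0]; next}
    # Second input (stdin): NUL-delimited paths; with FS=/ the last field is the file name.
    {
      n = split(tolower($NF), words, " ")
      for (i = 1; i <= n; i++)
        if (words[i] in ids) {
          print
          break
        }
    }' ids.txt ids_processed=1 RS='\0' FS=/ -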

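That way you process the file list only once, and checking against the 100k IDs is just a hash-table lookup per word rather than up to 100k regex/wildcard matches.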
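To see the lookup behave as described, here is a minimal sketch on made-up data. The directory layout, file names and IDs are hypothetical, and it is newline-delimited rather than NUL-delimited so that it runs with any POSIX `awk`; that only holds if no file name contains a newline.

```shell
#!/bin/sh
# Hypothetical sample data: the file names and IDs below are invented.
tmp=$(mktemp -d)
mkdir -p "$tmp/dir/sub"
touch "$tmp/dir/report AB12 final.txt" "$tmp/dir/sub/notes cd34" "$tmp/dir/xab12x.txt"
printf '%s\n' ab12 cd34 ffff > "$tmp/ids.txt"

# Same logic as the gawk command, minus RS='\0' (assumes no newlines in names).
matches=$(
  find "$tmp/dir" -type f |
    awk '
      !ids_processed {ids[$0]; next}          # first input: load IDs as array keys
      {
        n = split(tolower($NF), words, " ")   # $NF is the file name because FS=/
        for (i = 1; i <= n; i++)
          if (words[i] in ids) { print; break }
      }' "$tmp/ids.txt" ids_processed=1 FS=/ - |
    sort
)
printf '%s\n' "$matches"
rm -rf "$tmp"
```

The two files with `AB12` and `cd34` as separate words are listed, while `xab12x.txt` is not, since the ID there is only a substring of a word. Because the name is lower-cased before the lookup, the IDs in the list need to be lower-case as well.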

Question 2

What I would do:

Write a script that saves all the file names to a temporary file:

# maybe run this from cron or behind inotifywait
find dir -type f -print > /tmp/filelist

Then run lookups against that file list whenever needed, with the IDs as the pattern file:

fgrep -if hexids /tmp/filelist

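I might suggest using `-wif` instead of `-if`, but judging by the other comments it is not clear that the question gives accurate information. See `man grep` for more information.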
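The difference between the two option sets can be checked on a small made-up list. This is only a sketch: the file names and IDs are hypothetical, and it uses `grep -F`, the standard spelling of `fgrep`.

```shell
#!/bin/sh
# Hypothetical sample data standing in for /tmp/filelist and hexids.
tmp=$(mktemp -d)
printf '%s\n' 'dir/report AB12 final.txt' 'dir/xab12x.txt' 'dir/notes cd34' > "$tmp/filelist"
printf '%s\n' ab12 cd34 > "$tmp/hexids"

# -i -f: case-insensitive fixed-string match anywhere in the line.
substring_hits=$(grep -F -i -f "$tmp/hexids" "$tmp/filelist")

# -w -i -f: the match must additionally be a whole word.
word_hits=$(grep -F -w -i -f "$tmp/hexids" "$tmp/filelist")

printf 'substring matches:\n%s\n\nwhole-word matches:\n%s\n' "$substring_hits" "$word_hits"
rm -rf "$tmp"
```

Without `-w`, `dir/xab12x.txt` is reported because `ab12` occurs inside a longer word; with `-w` it is dropped. Note that `grep` looks at the whole path, so an ID that appears as a word in a directory name would also match.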
