Longest file name in a large directory

Answer 1

I guess @steeldriver's solution is the better choice, but here is my alternative: you can combine a few commands to find exactly the two (or more) longest file names.

find . | awk 'function base(f) { sub(".*/", "", f); return f }
{ print length(base($0)), $0 }' | sort -nr | head -n 2

The output looks like this:

length ./path/to/file

Here is a real example:

42 ./path/to/this-file-got-42-character-right-here.txt
31 ./path/to/this-file-got-31-character.txt

Notes

find lists every file and directory under the current directory, for example:

./path/to/this-file-got-31-character.txt

With awk we prepend the length of the file name to each line (it is the length of the file name itself, not of the whole path):

31 ./path/to/this-file-got-31-character.txt

Finally we sort numerically by that length, longest first, and take the first two lines with head.
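The pipeline described above can be wrapped in a small function so the number of results is adjustable. This is only a sketch: the name longest_names and its two parameters are made up for illustration, and it assumes file names contain no newline characters, since find's output is read line by line.

```shell
# Hypothetical helper (not from the original answer): print the N longest
# base names under DIR, each prefixed with its length.
# Assumes no file name contains a newline.
longest_names() {
    dir=${1:-.}   # directory to search, defaults to the current one
    n=${2:-2}     # how many results to keep, defaults to two
    find "$dir" | awk '
        # strip everything up to the last slash, leaving the base name
        function base(f) { sub(".*/", "", f); return f }
        { print length(base($0)), $0 }
    ' | sort -nr | head -n "$n"
}

# Example call: longest_names . 5
```

Note that find also lists the starting directory itself, so a long directory name passed as DIR can show up in the result; running it from inside the directory with "." avoids that.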


Answer 2

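Based on the comments, what you really need in this case is a list of all files whose names are longer than some maximum number of characters. Fortunately, that is relatively easy with find's regular expression support: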

find "$PWD" -regextype posix-extended -regex '.*[^/]{255,}$'
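To try the same regex approach with a different threshold, the limit can be turned into a parameter. The function below is a sketch, not part of the original answer: the name names_at_least and its arguments are hypothetical, and it relies on GNU find, because -regextype is a GNU extension. Since -regex always matches against the whole path, the trailing $ anchor is left out here.

```shell
# Hypothetical helper (illustration only): list every path under DIR whose
# last component has at least MIN characters. Requires GNU find.
names_at_least() {
    dir=$1
    min=$2
    # [^/]{MIN,} must cover the tail of the path without crossing a slash,
    # so it can only match inside the base name.
    find "$dir" -regextype posix-extended -regex ".*[^/]{$min,}"
}

# Example call: names_at_least "$PWD" 255
```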

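With such a large number of files and directories you may want to avoid sorting. Instead, we can simply keep a running record of the longest and second-longest file names, along with their full pathnames: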

find "$PWD" -printf '%p\0' | awk -v RS='\0' '
  {
    # get the length of the basename of the current filepath
    n = split($0,a,"/");
    currlen = length(a[n]);

    if (currlen > l[1]) {
      # bump the current longest to 2nd place
      l[2] = l[1]; p[2] = p[1];
      # store the new 1st place length and pathname
      l[1] = currlen; p[1] = $0;
    }
    else if (currlen > l[2]) {
      # store the new 2nd place length and pathname
      l[2] = currlen; p[2] = $0;
    }
  }

  END {
      # print in rank order; "for (i in l)" does not guarantee any order
      for (i = 1; i <= 2; i++) printf "(%d) %d : %s\n", i, l[i], p[i];
  }'
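The same running-record idea extends from two places to any fixed number of places by shifting lower ranks down whenever a longer name arrives. The sketch below is a generalisation written for illustration, not the answer's own code: the function name top_n_names is hypothetical, and for simplicity it reads one path per line on standard input, so it assumes no newlines in file names (unlike the NUL-separated version above).

```shell
# Hypothetical helper (illustration only): keep the N longest base names
# seen on stdin without sorting the whole input.
# Assumes one path per line, i.e. no newlines in file names.
top_n_names() {
    awk -v N="${1:-2}" '
    {
        # length of the last path component
        n = split($0, a, "/")
        len = length(a[n])

        # find the first rank whose stored length is shorter
        for (i = 1; i <= N; i++)
            if (len > l[i]) break

        if (i <= N) {
            # shift lower ranks down by one, dropping the last
            for (j = N; j > i; j--) { l[j] = l[j-1]; p[j] = p[j-1] }
            l[i] = len; p[i] = $0
        }
    }
    END {
        # print only the ranks that were actually filled
        for (i = 1; i <= N; i++)
            if (l[i] > 0) printf "(%d) %d : %s\n", i, l[i], p[i]
    }'
}

# Example call: find . | top_n_names 3
```

Each input line costs at most N comparisons and N shifts, so for small N this stays close to a single pass over the find output.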

Or, using GNU awk (which supports true two-dimensional arrays, that is, arrays of arrays):

find "$PWD" -printf '%p\0' | gawk -v RS='\0' '
  {
    # get the length of the basename of the current filepath
    n = split($0,a,"/");
    currlen = length(a[n]);

    if (currlen > p[1][1]) {
      # bump the current longest to 2nd place
      p[2][1] = p[1][1]; p[2][2] = p[1][2];
      # store the new 1st place length and pathname
      p[1][1] = currlen; p[1][2] = $0;
    }
    else if (currlen > p[2][1]) {
      # store the new 2nd place length and pathname
      p[2][1] = currlen; p[2][2] = $0;
    }
  }

  END {
      # p[i][1] is the length, p[i][2] the pathname; loop over the two ranks
      for (i = 1; i <= 2; i++) printf "(%d) %d : %s\n", i, p[i][1], p[i][2];
  }'
