What is the most thorough way to determine a file's user-facing "type" on Linux?

Question

Answer 1

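I'm never going to accept this answer; I just figured I should share the ad hoc abomination I'm currently using. It's slow and crude, but it seems to produce very useful results. I like that for zero-length files it reports the file type "empty"; I haven't seen any file manager do that.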

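Since it's a quick prototype, it takes the form of a xonsh script. If you're not familiar with xonsh, for present purposes you can assume it's Python with the very handy syntax $(blah bloo) for capturing the output of the command blah bloo.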
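For readers who want to follow along in plain Python, here is a rough stand-in for that $(...) syntax. The helper name capture is made up for illustration, and this is only an approximation of what xonsh does (it returns standard output as text and ignores the exit status):

```python
import subprocess

def capture(*argv):
    """Run a command and return its standard output as text.

    Hypothetical helper approximating xonsh's $(...) capture; a non-zero
    exit status is not treated as an error.
    """
    result = subprocess.run(argv, stdout=subprocess.PIPE, text=True, check=False)
    return result.stdout

# Roughly the plain-Python spelling of: filesez = $(file -b $ARG1)
# filesez = capture('file', '-b', path)
```

Arguments are passed as a list rather than through a shell, so file names containing spaces need no extra quoting.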

#!/usr/bin/env xonsh
import re

filesez = $(file -b $ARG1)
filesez = filesez.strip()

# file is very good at detecting scripts:
script = re.search(r'\w*\s*script', filesez)
if script:
   print(script.group())
   exit(0)

# If file said anything complicated, we'll try mimetype
punctuation = re.search(r'[,;]', filesez)
# For text files, assume the extension is right
trust = (filesez == 'ASCII text')
if not (punctuation or trust):
   print(filesez)
   exit(0)

# OK, didn't get enough info from file, try mimetype
# Use a format that gives both the mime type and
# the description at once, to save calls
oformat = r'--output-format=%m|%d'
mimewith = $(mimetype @(oformat) $ARG1)
[mimetypewith, descwith] = mimewith.strip().split('|')

# Usually we go with the extension
desc = descwith
# And we always do with text files, as mentioned above
trust = (trust or 'Unicode text' in filesez)
if not trust:
   mimewithout = $(mimetype -M @(oformat) $ARG1)
   [mimetypewithout, descwithout] = mimewithout.strip().split('|')
   if mimetypewith.split('/')[0] == mimetypewithout.split('/')[0]:
      # If the types are compatible, believe the extension
      desc = descwith
   else:
      desc = descwithout

# Calling everything a document is a waste of screenspace:
if desc.endswith(' document'):
   desc = desc[0:-9]

# All right, we've done the best we can:
print(desc)
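To make the decision logic easier to poke at without running file or mimetype at all, the same rules can be sketched as a pure function that takes the tool output as strings. The names describe, with_ext and without_ext are hypothetical, and the sample strings in the usage lines are illustrative rather than captured from a real run:

```python
import re

def describe(filesez, with_ext, without_ext):
    """Pick a short description from already-collected tool output.

    filesez     -- text as printed by `file -b`
    with_ext    -- (mime type, description) from `mimetype`
    without_ext -- (mime type, description) from `mimetype -M`
    """
    filesez = filesez.strip()

    # Scripts: keep only the "<language> script" part of file's answer.
    script = re.search(r'\w*\s*script', filesez)
    if script:
        return script.group()

    # Short answers without punctuation are used as-is, except plain
    # "ASCII text", where the extension is trusted instead.
    trust = filesez == 'ASCII text'
    if not (re.search(r'[,;]', filesez) or trust):
        return filesez

    # Default to the extension-based description; fall back to the
    # content-based one only when the top-level MIME types disagree.
    desc = with_ext[1]
    trust = trust or 'Unicode text' in filesez
    if not trust and with_ext[0].split('/')[0] != without_ext[0].split('/')[0]:
        desc = without_ext[1]

    # Drop a trailing " document" to save screen space.
    if desc.endswith(' document'):
        desc = desc[:-len(' document')]
    return desc

# Illustrative inputs only:
print(describe('empty', ('text/plain', 'plain text document'),
               ('text/plain', 'plain text document')))
print(describe('PNG image data, 640 x 480', ('text/plain', 'plain text document'),
               ('image/png', 'PNG image')))
```

Unlike the script, this version needs both mimetype results up front, so it gives up the shortcut of skipping the second call for trusted text files. In exchange the rules can be checked with made-up inputs.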

If you have a better answer, please post it; I'd really like to clean this one up.
