Indexing files' metadata

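Is there a tool on Linux that can index (and search) files based on their metadata? I searched a little and found (here) that several tools allow file indexing on Linux: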

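But none of these seems to index files' metadata (or maybe I just did not find out how in the documentation). One of them might look like an answer, but its documentation is very short and I cannot tell whether it is really what I am looking for.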

I found this: GWorkspace. It does not look very legit, but it states that it does allow metadata-based indexing:

GWMetadata is the metadata indexing and search system, which includes the live folders feature.

My goal

More precisely, I would like to mimic what can be done (natively) on macOS:

# Create a file and set metadata
touch test_file.txt
xattr -w com.apple.metadata:MY_META todo test_file.txt
# Wait a few seconds for the index to update
sleep 5
# Search files that have a given metadata
mdfind "MY_META=todo"

It returns the path of the file we just created: $(pwd)/test_file.txt. I am looking for a tool with a CLI, because I need to build automation around these few commands.

Answer 1

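I am not aware of a similar facility on Linux (Debian flavors) or BSD hosts, but on xattr-enabled systems rolling your own is not too complicated. At the end of this answer I give a working solution for seeking (not for indexing per se), in the form of a POSIX-compliant script. I built in a CLI usage help function, so you can start using it without studying every detail of the code, although "read the code" is generally the best advice.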

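Indexing would mean running the script unmodified, but as a daemon (perhaps as a systemd service unit) that keeps a small database up to date (that is, keeps the relevant data in some kind of table). However, doing so continuously puts a heavy load on the host of the indexed FS, as is the case with Baloo on KDE. Without a compelling reason to do so, I would avoid it altogether and stick with a lighter-weight CLI spot-search utility, such as the one proposed here.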
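
For readers who do want an index, the "small table" idea can be sketched as follows. This is only an illustration, not part of the script in this answer: the function names build_index and query_index and the three-column layout (path, name, value, separated by tabs) are hypothetical, and the sketch assumes that the attr utility is installed and that attribute values contain no tab or newline.

```shell
#!/bin/sh
# Hypothetical sketch: a one-shot xattr index kept in a tab-separated file.
# Assumption: attribute values contain neither tabs nor newlines.

# build_index ROOT INDEX_FILE
# Walk ROOT and write one "path<TAB>name<TAB>value" line per user xattr.
build_index() {
    find "$1" -type f -exec sh -c '
        for f do
            # attr -ql lists the attribute names, attr -qg prints one value
            attr -ql "$f" 2>/dev/null | while IFS= read -r name; do
                val=$(attr -qg "$name" "$f" 2>/dev/null)
                printf "%s\t%s\t%s\n" "$f" "$name" "$val"
            done
        done
    ' sh_exec {} + > "$2"
}

# query_index INDEX_FILE NAME VALUE
# Print the paths whose attribute NAME is exactly VALUE.
query_index() {
    # Values travel through the environment so awk does not expand
    # backslashes; appending "" forces a string (not numeric) comparison.
    k="$2" v="$3" awk -F '\t' \
        '($2 "") == (ENVIRON["k"] "") && ($3 "") == (ENVIRON["v"] "") { print $1 }' "$1"
}
```

Under these assumptions, build_index . /tmp/xattr.tsv followed by query_index /tmp/xattr.tsv MY_META todo would play roughly the role of mdfind "MY_META=todo". Re-running build_index from time to time is cheaper than a permanent daemon, at the price of a possibly stale index.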


Description of the findxattr script

  • Based on the well-known external utility find and its idiom:
    find [options, filters ...] -exec sh -c '[...]' sh_exec "{}" \; 2>/dev/null

  • Runs in POSIX-compliant mode even on hosts where the shebang #!/usr/bin/sh points to a fake Bourne shell, which is what you will often find:
    /usr/bin/sh --> /usr/bin/bash.

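  • Consists of a wrapper around find. It does not implement all the extra features commonly found in find implementations, but it meets the OP's requirements and gives CLI users who want to refine their searches more flexibility. In particular, it implements: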

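    • a "help" mode that documents proper usage when findxattr -h|--help is issued,
    • short and long options, preceded by one and two hyphens respectively,
    • two filters, -p|--path and -m|--md|--maxdepth, which behave much like
      find [-maxdepth <d>] [-path <path_specs>] ... (see man find),
    • a global search from $PWD (when invoked without arguments or with -a|--all),
    • a tag search on the key-value (KV) pairs of x-attributes, i.e. by:
      • key: -x|--xattr <xattr_name>==
      • value: -x|--xattr ==<xattr_value>
      • both: -x|--xattr <xattr_name>==<xattr_value>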

    where the names and values of x-attributes may contain spaces, provided they are adequately quoted.

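  • Can easily be extended with more options and switches, which getopt (in the script) will happily parse. Taking advantage of other options and filters requires extending the logic, but the way the existing options are handled is a blueprint for widening the tool's scope.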
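
To make the three forms of the -x argument concrete, here is a minimal sketch of how such an argument can be split with plain POSIX parameter expansion. The helper split_kv and the variables kv_key and kv_val are hypothetical names used for illustration only; the sketch splits at the first "==", so a value may itself contain "==".

```shell
#!/bin/sh
# Hypothetical helper: split "<name>==<value>" at the first "==".
# Sets kv_key and kv_val; returns 1 when the argument holds no "==".
split_kv() {
    case $1 in
        *==*) ;;            # well formed, carry on
        *) return 1 ;;      # no separator: empty or malformed argument
    esac
    kv_key=${1%%==*}        # everything before the first "=="
    kv_val=${1#*==}         # everything after the first "=="
}
```

An empty kv_key then means "search by value only", and an empty kv_val means "search by key only".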


Testing the script
First build some toy x-attributes.

$ mkdir -p ~/fully/qualified/_1_path ~/fully/qualified/_2_path
$ touch ~/fully/qualified/foo ~/fully/qualified/_1_path/bar ~/fully/qualified/_2_path/foobar ~/fully/qualified/baz

$ setfattr -n user.likes -v 'I like strawberries.' ~/fully/qualified/foo

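In the setfattr command above, the prefix user. designates the user namespace of the x-attribute named likes. New x-attributes are defined as key-value (KV) pairs. Here the keys are the strings likes, dislikes and filebirth; their values must be introduced by -v.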

With attr instead of setfattr (and -V in place of -v), x-attributes are always created in the user namespace by default:

$ cd ~/fully/qualified
$ attr -s dislikes -V 'hacks' ./foo
$ attr -s filebirth -V '20220627-193029-CEST' ./_1_path/bar
$ attr -s filebirth -V '20210330-185430-CEST' ./_2_path/foobar
$ attr -s dislikes -V 'java' ./baz

To set, get, remove or list extended attributes, you can also use attr on compatible filesystem objects (see man attr):

Usage:
attr [-LRSq] -s attrname [-V attrvalue] pathname  # set value
attr [-LRSq] -g attrname pathname                 # get value
attr [-LRSq] -r attrname pathname                 # remove attr
attr [-LRq]  -l pathname                          # list attrs
      -s reads a value from stdin and -g writes a value to stdout

So, for example:

$ cd ~/fully/qualified

$ attr -qg likes ./foo
I like strawberries.
$ attr -qg filebirth ./_1_path/bar
20220627-193029-CEST

The script produces tab-separated output, for example:

$ cd ~; pwd
/home/USER

$ findxattr -m 4                             # search entire subtree (from `$PWD`) with max depth of 4    
./fully/qualified/foo        likes        I like strawberries.
./fully/qualified/foo        dislikes     hacks
./fully/qualified/_1_path/bar        filebirth        20220627-193029-CEST
./fully/qualified/_2_path/foobar        filebirth        20210330-185430-CEST
./fully/qualified/baz        dislikes     java

$ findxattr -m 3 -x dislikes==               # search entire subtree (from `$PWD`) by name, depth
./fully/qualified/foo        dislikes     hacks
./fully/qualified/baz        dislikes     java

$ findxattr -m 4 -p "*lified/_2_*" -x filebirth== # search by depth, path, name
./fully/qualified/_2_path/foobar        filebirth        20210330-185430-CEST

$ findxattr -m 4 -x =='20220627-193029-CEST' # search entire subtree (from `$PWD`) by value, depth
./fully/qualified/_1_path/bar        filebirth        20220627-193029-CEST

$ findxattr -x filebirth==                   # search entire subtree (from `$PWD`) by name
./fully/qualified/_1_path/bar        filebirth        20220627-193029-CEST
./fully/qualified/_2_path/foobar     filebirth        20210330-185430-CEST
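
Since the question asks for something to automate around, the tab-separated output can be reduced to a bare list of paths, which is what mdfind prints. The filter below is a hedged sketch: paths_only is a hypothetical name, and it assumes that paths contain no tab or newline and do not end in a space (the script pads the path column with spaces, which the filter trims).

```shell
#!/bin/sh
# Hypothetical filter for findxattr-style lines (path<TAB>name<TAB>value):
# keep the path column, trim the padding spaces, drop duplicate paths.
paths_only() {
    cut -f1 | sed 's/ *$//' | sort -u
}
```

For instance, findxattr -x MY_META==todo | paths_only would list each matching file once, ready to be consumed by a while read loop.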

Code

#!/usr/bin/sh

#-------------------------------------------------------------
# Author: CBhihe
# Copyright (c) 2022 C. K. Bhihe
# Available under GPL3 at github.com/Cbhihe
#-------------------------------------------------------------

version='0.5.0'

set -o posix
#set -x

getopt_parsed=$(getopt -q -s sh -a -l 'all,help,path:,xattr:,maxdepth:' -o '+ham:p:x:' -- "$@")
exit_code=$?
if [ $exit_code -ne 0 ]; then
    printf "getopt exited with code %d (%s).\n%s\n" $exit_code "getopt parsing error"\
    "The most probable cause is an unknown option. Review command usage with \`findxattr -h|--help'" >&2
    exit 1
fi

eval set -- "$getopt_parsed"
unset getopt_parsed

xattrkey=""
xattrval=""

while true; do
    case "$1" in
        '-a'|'--all')
            /usr/bin/find . -exec sh -c '
                while IFS= read -r xattrkey; do
                    xattrvalout=`/usr/bin/attr -qg "$xattrkey" "$1" 2>/dev/null`
                    printf "%-20s\t%-20s\t%s\n" "$1" "$xattrkey" "$xattrvalout"
                done < <(/usr/bin/attr -ql "$1" 2>/dev/null)
            ' sh_exec "{}" \; 2>/dev/null
            exit 0
        ;;

        '-m'|'--maxdepth'|'--md'|'--maxd')
            tmp=`expr "$2" + 0`
            if (( "$2" == "$tmp" )); then
                maxdepth="$2"
                shift 2
                continue
            else
                printf "The script exited because '--maxdepth' arg must be a positive integer; current arg is %s.\n%s\n" "$2" "Review command usage with \`findxattr -h|--help'" >&2
                exit 1
            fi
        ;;

        '-p'|'--path')
            locpath="$2"
            #if [ "$locpath" = "\(/*[^/]\+\)\+/$" ]; then
            #if expr "$locpath" : "^\(/*\([^/]\)\+\)\+$" >/dev/null; then
            if expr "$locpath" : "^\(/*[^/]\+\)\+$" >/dev/null; then
                shift 2
                continue
            else
                printf "The script exited because '--path' arg must be a valid path; current arg is %s.\n%s\n" "$2" "Review command usage with \`findxattr -h|--help'" >&2
                exit 1
            fi
        ;;

        '-x'|'--xattr')
            keyval="$2"
            found=1
            if [ "$keyval" != "${keyval%==*}" ]; then
                found=0
            fi

            if [ "$found" = "1" ]; then
                printf "The script exited because '-x|--xattr' arg appears to be either empty or malformed.\nReview command usage with \`findxattr -h|--help'. Remember that extended attributes\ncan be sought by key AND value (<key>==<value>), or only by key (<key>==), or only by value\n(==<value>), where in each case the parenthesized content represents the '-x' option's argument.\n" >&2
                exit 1
            else
                xattrkey="${keyval%%==*}"
                xattrval="${keyval#*==}"
                shift 2
                continue
            fi
        ;;

        '-h'|'--help')
            printf "%s\n" " " "This is a script based on \`find' but restricted to the options shown below. Both short- and long-" \
"format options are allowed. Unknown options cause the script to abort with exit code 1." \
" " \
"Usage:" \
"   \`findxattr -h|--help' Prints this usage information. This option is used alone." \
"   \`findxattr -a|--all'  Searches recursively for all files with xattr(s) starting at \$PWD." \
"                         This option is used alone." \
"   \`findxattr [-m|--maxdepth <d>] [-p|--path <path>] [-x|--xattr <xattr_name>==<xattr_value>]'" \
"       Options that can be combined:" \
"         -m|--maxdepth  Identical to \`find -maxdepth <d>' option, where \`d' is a positive integer;" \
"                        Limits any recursive search with \`find', to the specified level of the file tree." \
"                        Note that the level of the file tree is understood counting from \$PWD, NOT" \
"                        from a supposed start point represented by the \`--path' argument if present." \
"         -p|--path      Identical to \`find -path <path>' option;" \
"                        Traverse the file tree from the specified path, searching for ANY xattr," \
"                        unless the \`--xattr' option is invoked to filter the search so a specific xattr" \
"                        name and value can be sought." \
"         -x|--xattr     Lists files with specified \`xattr', in the file tree starting at \$PWD unless" \
"                        \`--path' is invoked." \
"                        A compulsory argument of the form: '<xattr_name>==<xattr_value>' is expected." \
"                        Quoting is needed in case the argument contains space(s) or special characters." \
" "
            exit 0
        ;;

        '--')
            shift
            break
        ;;

        *)
            printf "%s\n" "Internal error. Abort." >&2
            exit 1
        ;;

    esac
done

if [ -n "$maxdepth"  ] && [ -n "$locpath" ]; then
    set -- -maxdepth "$maxdepth" -path "$locpath"
elif [ -z "$maxdepth"  ] && [ -n "$locpath" ]; then
    set -- -path "$locpath"
elif [ -z "$maxdepth"  ] && [ -z "$locpath" ]; then
    set --
else
    #[ -n "$maxdepth"  ] && [ -z "$locpath" ]
    set -- -maxdepth "$maxdepth"
fi

if [ -n "$xattrkey" ] && [ -n "$xattrval" ]; then
#if expr "$xattrkey" != "" >/dev/null && expr "$xattrval" != "" >/dev/null; then
    xattrkey="$xattrkey" xattrval="$xattrval" /usr/bin/find . "$@" -exec sh -c '
        xattrvalout=`/usr/bin/attr -qg "$xattrkey" "$1" 2>/dev/null`
        if [ "$xattrvalout" = "$xattrval" ]; then
        #if expr "$xattrvalout" = "$xattrval" >/dev/null; then
            printf "%-20s\t%-20s\t%s\n" "$1" "$xattrkey" "$xattrvalout"
        fi
    ' sh_exec "{}" \; 2>/dev/null

elif [ -n "$xattrkey" ] && [ -z "$xattrval" ]; then
#elif expr "$xattrkey" != "" >/dev/null && expr "$xattrval" = "" >/dev/null; then
    xattrkey="$xattrkey" xattrval="$xattrval" /usr/bin/find . "$@" -exec sh -c '
        xattrvalout=`/usr/bin/attr -qg "$xattrkey" "$1" 2>/dev/null`
        if [ -n "$xattrvalout" ]; then
            while IFS= read -r xattrvalout || [ -n "$xattrvalout" ]; do
                printf "%-20s\t%-20s\t%s\n" "$1" "$xattrkey" "$xattrvalout"
            done <<<"$xattrvalout"
        fi
    ' sh_exec "{}" \; 2>/dev/null

elif [ -z "$xattrkey" ] && [ -z "$xattrval" ]; then
#elif expr "$xattrkey" = "" >/dev/null && expr "$xattrval" = "" >/dev/null; then
    /usr/bin/find . "$@" -exec sh -c '
        xattrkeys=`/usr/bin/attr -ql "$1" 2>/dev/null`
        if [ -n "$xattrkeys" ]; then
            while IFS= read -r xattrkey || [ -n "$xattrkey" ]; do
                xattrvalouts=`/usr/bin/attr -qg "$xattrkey" "$1" 2>/dev/null`
                if [ -n "$xattrvalouts" ]; then
                    while IFS= read -r xattrvalout || [ -n "$xattrvalout" ]; do
                        printf "%-20s\t%-20s\t%s\n" "$1" "$xattrkey" "$xattrvalout"
                    done <<<"$xattrvalouts"
                fi
            done <<<"$xattrkeys"
        fi
    ' sh_exec "{}" \; 2>/dev/null

else
    # [ -z "$xattrkey" ] && [ -n "$xattrval" ]; then
    # expr "$xattrkey" = "" >/dev/null && expr "$xattrval" != "" >/dev/null
    xattrkey="$xattrkey" xattrval="$xattrval" /usr/bin/find . "$@" -exec sh -c '
        xattrkeys=`/usr/bin/attr -ql "$1" 2>/dev/null`
        if [ -n "$xattrkeys" ]; then
            while IFS= read -r xattrkey || [ -n "$xattrkey" ]; do
                xattrvalouts=`/usr/bin/attr -qg "$xattrkey" "$1" 2>/dev/null`
                if [ -n "$xattrvalouts" ]; then
                    while IFS= read -r xattrvalout || [ -n "$xattrvalout" ]; do
                        if [ "$xattrvalout" = "$xattrval" ]; then
                            printf "%-20s\t%-20s\t%s\n" "$1" "$xattrkey" "$xattrvalout"
                        fi
                    done <<<"$xattrvalouts"
                fi
            done <<<"$xattrkeys"
        fi
    ' sh_exec "{}" \; 2>/dev/null

fi

exit 0
