I want to sort
- by file name;
- where file names share a prefix and end in a number, numerically by that trailing number.
The following
cat /tmp/foo.txt | sort -t/ -k3,3 -k3,3n
achieves the first goal but not the second.
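A small sketch of why the numeric key does not help here (the helper name numeric_key_only is made up for illustration). In the command above, the first key -k3,3 already orders the names as strings, so the second key is only consulted when two names are identical. Used on its own, -k3,3n looks for a number at the start of the field, and a field such as test-10.txt starts with a letter, so every key evaluates to 0 and sort falls back to comparing whole lines:

```shell
# Hypothetical helper: sort on field 3 using the numeric key only.
# LC_ALL=C is an assumption made here to get predictable byte ordering.
numeric_key_only() {
  LC_ALL=C sort -t/ -k3,3n
}

printf '%s\n' dirA/catC/test-2.txt dirA/catC/test-10.txt | numeric_key_only
# dirA/catC/test-10.txt
# dirA/catC/test-2.txt
```

The number has to be isolated into its own key before a numeric comparison can do anything useful.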
Input /tmp/foo.txt
dirA/catA/apple.txt
dirA/catA/addition.txt
dirA/catA/difference
dirA/catB/binary.txt
dirA/catB/carry.txt
dirA/catB/digit
dirA/catC/test-10.txt
dirA/catC/test-100.txt
dirA/catC/test-1000.txt
dirA/catC/test-11.txt
dirA/catC/test-2.txt
dirA/catC/test-20.txt
dirA/catC/test-25.txt
dirA/catC/test-5.txt
dirA/catC/test-50.txt
dirA/catC/test-500.txt
dirA/catC/test-7.txt
dirA/catC/test-75.txt
dirA/catC/test-8.txt
dirA/catC/abc-test-9.txt
dirA/catC/abc-test-999.txt
dirA/catC/abc-test-75.txt
dirA/catC/abc-test-8.txt
Desired output
dirA/catC/abc-test-8.txt
dirA/catC/abc-test-9.txt
dirA/catC/abc-test-75.txt
dirA/catC/abc-test-999.txt
dirA/catA/addition.txt
dirA/catA/apple.txt
dirA/catB/binary.txt
dirA/catB/carry.txt
dirA/catA/difference
dirA/catB/digit
dirA/catC/test-2.txt
dirA/catC/test-5.txt
dirA/catC/test-7.txt
dirA/catC/test-8.txt
dirA/catC/test-10.txt
dirA/catC/test-11.txt
dirA/catC/test-20.txt
dirA/catC/test-25.txt
dirA/catC/test-50.txt
dirA/catC/test-75.txt
dirA/catC/test-100.txt
dirA/catC/test-500.txt
dirA/catC/test-1000.txt
Answer 1
Perl to the rescue!
perl -e '
print for sort { (($a =~ m{.*/([^0-9]*)})[0] cmp ($b =~ m{.*/([^0-9]*)})[0])
||
(($a =~ /-([0-9]+)/)[0] <=> ($b =~ /-([0-9]+)/)[0]) } <>
' -- /tmp/foo.txt
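- <> reads the input lines
- sort sorts the list according to the given code
- m{.*/([^0-9]*)} extracts the base name up to the first digit (if there is one)
- cmp does a string comparison
- if the strings are equal, the || ("or") applies the second comparison, in which:
- /-([0-9]+)/ extracts the number
- <=> does a numeric comparison
- the (...)[0] construct is needed because a match returns a list of captures (corresponding to $1, $2, etc.). List context is needed to get the captures; we are only interested in the first one (there are no others).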
Answer 2
awk '
BEGIN {FS = "[-/.]"; OFS = "\t"}
{n = 0}
$(NF-1) ~ /^[0-9]+$/ {n = $(NF-1)}
{print $3, n, $0}
' foo.txt \
| sort -k1,1 -k2,2n \
| cut -f3-
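This is a Schwartzian transform:
- the awk program prepends the first word of the file name and the file's number as columns in front of the file path
- the data is sorted by name, then by number
- the added columns are then removed.
Output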
dirA/catC/abc-test-8.txt
dirA/catC/abc-test-9.txt
dirA/catC/abc-test-75.txt
dirA/catC/abc-test-999.txt
dirA/catA/addition.txt
dirA/catA/apple.txt
dirA/catB/binary.txt
dirA/catB/carry.txt
dirA/catA/difference
dirA/catB/digit
dirA/catC/test-2.txt
dirA/catC/test-5.txt
dirA/catC/test-7.txt
dirA/catC/test-8.txt
dirA/catC/test-10.txt
dirA/catC/test-11.txt
dirA/catC/test-20.txt
dirA/catC/test-25.txt
dirA/catC/test-50.txt
dirA/catC/test-75.txt
dirA/catC/test-100.txt
dirA/catC/test-500.txt
dirA/catC/test-1000.txt
The same procedure as a Perl one-liner (except that you read the Perl statement "bottom-up")
perl -e '
print join "",
map { $_->[2] }
sort { $a->[0] cmp $b->[0] || $a->[1] <=> $b->[1] }
map { [m{.*/(\D+)(\d*)}, $_] }
<>;
' foo.txt
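To make the decorate step of the awk version concrete, here is a minimal sketch that runs only the first stage on two sample lines, so the temporary key columns are visible. The function name decorate_awk is an assumption for illustration; the awk program is the one shown above.

```shell
# Hypothetical wrapper around the awk stage: prints
# "<first word of file name> TAB <number or 0> TAB <original line>".
decorate_awk() {
  awk '
    BEGIN { FS = "[-/.]"; OFS = "\t" }
    { n = 0 }                                   # default when there is no number
    $(NF-1) ~ /^[0-9]+$/ { n = $(NF-1) }        # number sits just before the extension
    { print $3, n, $0 }
  '
}

printf '%s\n' dirA/catC/test-10.txt dirA/catA/difference | decorate_awk
# test<TAB>10<TAB>dirA/catC/test-10.txt
# difference<TAB>0<TAB>dirA/catA/difference
```

Lines without a number get 0 as their numeric key, so they are ordered by name alone.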
Answer 3
Using sed:
cat /tmp/foo.txt | sed "s/[[:alnum:]-]*\/[[:alnum:]-]*\/\([[:alpha:]-]*\)\([[:digit:]]*\).*/\0|\1|\2 /"|sort -t"|" -k2,2 -k3n|sed "s/\([^|]*\).*/\1/"
The trick is to temporarily append the fields needed for sorting to the end of each line.
Oops, this is better:
cat source | sed "s/[^/]*\/[^/]*\/\([^[:digit:]]*\)\([[:digit:]]*\).*/\0|\1|\2 /"|sort -t"|" -k2,2 -k3n|sed "s/\([^|]*\).*/\1/"
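To see what the sort stage receives, the following sketch runs only the decorating sed on two sample lines. It is a variant of the command above, not a copy: it uses & (the portable spelling of the whole match) instead of \0, uses # as the s delimiter so the slashes need no escaping, and drops the trailing space. The function name decorate_sed is made up for illustration.

```shell
# Hypothetical helper: append "|<name before the number>|<number>" to each line.
# Assumes exactly two directory levels, as in the command above.
decorate_sed() {
  sed 's#^[^/]*/[^/]*/\([^0-9]*\)\([0-9]*\).*$#&|\1|\2#'
}

printf '%s\n' dirA/catC/test-10.txt dirA/catA/difference | decorate_sed
# dirA/catC/test-10.txt|test-|10
# dirA/catA/difference|difference|
```

Field 2 (split on |) is then the string key and field 3 the numeric key.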
I changed the original problem slightly: sort by the last path component (the file name) without its number.
dirA/catC/abc-test-8.txt
dirA/catC/abc-test-9.txt
dirA/catC/abc-test-75.txt
dirA/catC/abc-test-999.txt
dirA/catA/addition.txt
dirA/catA/apple.txt
dirA/catB/binary.txt
dirA/catB/carry.txt
dirA/catA/difference
dirA/catB/digit
dirA/catC/test-2.txt
dirA/catC/test-5.txt
dirA/catC/test-7.txt
dirA/catC/test-8.txt
dirA/catC/test-10.txt
dirA/catC/subdir/test-11.txt
dirA/catC/test-11.txt
dirA/cat C/subdir/test-12.txt
dirA/catC/test-20.txt
dirA/catC/test-25.txt
dirA/catC/test-50.txt
dirA/catC/test-75.txt
dirA/catC/test-100.txt
dirA/catC/test-500.txt
dirA/catC/test-1000.txt
cat /tmp/foo.txt | sed "s/\([^/]*\/\)\+\([^[:digit:]]*\)\([[:digit:]]*\)\(.*\)/\0|\2\4|\3 /"|sort -t"|" -k2,2 -k3n|sed "s/\([^|]*\).*/\1/"
Output:
dirA/catC/abc-test-8.txt
dirA/catC/abc-test-9.txt
dirA/catC/abc-test-75.txt
dirA/catC/abc-test-999.txt
dirA/catA/addition.txt
dirA/catA/apple.txt
dirA/catB/binary.txt
dirA/catB/carry.txt
dirA/catA/difference
dirA/catB/digit
dirA/catC/test-2.txt
dirA/catC/test-5.txt
dirA/catC/test-7.txt
dirA/catC/test-8.txt
dirA/catC/test-10.txt
dirA/catC/subdir/test-11.txt
dirA/catC/test-11.txt
dirA/cat C/subdir/test-12.txt
dirA/catC/test-20.txt
dirA/catC/test-25.txt
dirA/catC/test-50.txt
dirA/catC/test-75.txt
dirA/catC/test-100.txt
dirA/catC/test-500.txt
dirA/catC/test-1000.txt
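Explanation:
\([^/]*\/\)\+
matches (and so cuts off) the whole directory path => \1
\([^[:digit:]]*\)
the part of the file name before the number => \2
\([[:digit:]]*\)
the number => \3
\(.*\)
the extension => \4
\0|\2\4|\3
prints the whole line | the first part of the file name plus the extension | the number
sort -t"|" -k2,2 -k3n|sed "s/\([^|]*\).*/\1/"
sorts, then cuts off the parts that are no longer needed.
Instead of the last sed, cut -d "|" -f1
would also work.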
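Putting the pieces together with cut as the final stage, a minimal sketch might look like this. It is a variant of the pipeline above, not the exact command: it uses & instead of \0, a greedy .*/ for the directory part, and LC_ALL=C for predictable ordering. The function name sort_by_name_then_number is made up for illustration, and the sketch assumes no path contains a | character.

```shell
# Hypothetical helper: decorate each line with "|<name without number>|<number>",
# sort on those two keys, then strip the decoration with cut.
sort_by_name_then_number() {
  sed 's#^\(.*/\)\([^0-9/]*\)\([0-9]*\)\(.*\)$#&|\2\4|\3#' |
    LC_ALL=C sort -t'|' -k2,2 -k3,3n |
    cut -d'|' -f1
}

printf '%s\n' \
  dirA/catC/test-10.txt \
  dirA/catC/test-2.txt \
  dirA/catA/apple.txt \
  dirA/catC/abc-test-9.txt | sort_by_name_then_number
# dirA/catC/abc-test-9.txt
# dirA/catA/apple.txt
# dirA/catC/test-2.txt
# dirA/catC/test-10.txt
```

Because .*/ is greedy, the directory part extends to the last slash, so paths of any depth are handled the same way.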