Non-lexicographic sorting of a tab-delimited file by a specific column


file.txt is tab-delimited:

RollNo  Names    Class  Subject  Position
101     Anna     V      Maths    Average
102     Bob      V      Maths    Good
103     Charles  VI     Science  Good
104     Darwin   VI     Science  Improve
105     Eva      VII    English  Improve

I want to sort these rows so that they appear in the order Good, Average, Improve:

RollNo  Names    Class  Subject  Position
102     Bob      V      Maths    Good
103     Charles  VI     Science  Good
101     Anna     V      Maths    Average
104     Darwin   VI     Science  Improve
105     Eva      VII    English  Improve

Answer 1

Given file.txt:

RollNo  Names    Class  Subject  Position
101     Anna     V      Maths    Average
102     Bob      V      Maths    Good
103     Charles  VI     Science  Good
104     Darwin   VI     Science  Improve
105     Eva      VII    English  Improve

Replace the last word of each line with a number, sort on those numbers, then replace the numbers with the original words:

$ sed -e 's/Good$/1/' -e 's/Average$/2/' -e 's/Improve$/3/' file.txt | sort -k5n | sed -e 's/1$/Good/' -e 's/2$/Average/' -e 's/3$/Improve/'
RollNo  Names    Class  Subject  Position
102     Bob      V      Maths    Good
103     Charles  VI     Science  Good
101     Anna     V      Maths    Average
104     Darwin   VI     Science  Improve
105     Eva      VII    English  Improve

Alternatively, prefix each line with a number chosen from its last word, sort on that number, then remove the first column:

$ awk 'NR==1 {n=0} $NF=="Good" {n=1} $NF=="Average" {n=2} $NF=="Improve" {n=3} { print n, $0 }' file.txt | sort -n | cut -d' ' -f2-
RollNo  Names    Class  Subject  Position
102     Bob      V      Maths    Good
103     Charles  VI     Science  Good
101     Anna     V      Maths    Average
104     Darwin   VI     Science  Improve
105     Eva      VII    English  Improve
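
Both commands above rely on whitespace splitting, which would miscount fields if a name contained a space. A minimal sketch of the same decorate, sort, undecorate idea with an explicit tab delimiter follows. The function name sort_by_position, the fallback rank 9 for unknown values, and the temporary sample file are assumptions made for illustration; they are not part of the original answer.

```shell
#!/bin/sh
# Hypothetical sample data, recreated here so the sketch is self-contained.
tmp=$(mktemp)
printf '%s\t%s\t%s\t%s\t%s\n' \
  RollNo Names Class Subject Position \
  101 Anna V Maths Average \
  102 Bob V Maths Good \
  103 Charles VI Science Good \
  104 Darwin VI Science Improve \
  105 Eva VII English Improve > "$tmp"

# sort_by_position FILE: prefix each line with a rank and its line number,
# sort on both, then strip the two helper columns again.
sort_by_position() {
  awk -F'\t' -v OFS='\t' '
    BEGIN { rank["Good"] = 1; rank["Average"] = 2; rank["Improve"] = 3 }
    NR == 1 { print 0, NR, $0; next }            # header gets rank 0, stays first
    { r = ($5 in rank) ? rank[$5] : 9            # unknown values sort last
      print r, NR, $0 }
  ' "$1" | sort -t "$(printf '\t')" -k1,1n -k2,2n | cut -f3-
}

result=$(sort_by_position "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

The second sort key is the original line number, so rows with the same Position keep their input order regardless of how the sort implementation breaks ties.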

Answer 2

This snippet:

# Utility functions: print-as-echo, print-line-with-visual-space.
pe() { for _i; do printf "%s" "$_i"; done; printf "\n"; }
pl() { pe; pe "-----"; pe "$*"; }

# FILE (data1 in the run below) and E (the expected-output file) are
# assumed to be set earlier in the full script, which is not shown.
pl " Input data file $FILE:"
head "$FILE"

pl " Sort order file:"
head data2

pl " Expected output:"
head "$E"

pl " Results:"
msort -q -Z -l -n 5,5 -s data2 -c lexicographic "$FILE"

produces:

-----
 Input data file data1:
RollNo  Names   Class   Subject Position
101     Anna    V       Maths   Average
102     Bob     V       Maths   Good
103     Charles VI      Science Good
104     Darwin  VI      Science Improve
105     Eva     VII     English Improve

-----
 Sort order file:
Good
Average
Improve

-----
 Expected output:
RollNo  Names   Class   Subject Position
102     Bob     V       Maths   Good
103     Charles VI      Science Good
101     Anna    V       Maths   Average
104     Darwin  VI      Science Improve
105     Eva     VII     English Improve

-----
 Results:
RollNo  Names   Class   Subject Position
102     Bob     V       Maths   Good
103     Charles VI      Science Good
101     Anna    V       Maths   Average
104     Darwin  VI      Science Improve
105     Eva     VII     English Improve

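This uses msort, an alternative sorting program found in many repositories. It is slower than GNU sort, but it has many additional features that make it useful in many situations. The options used here are -Z (copy the first line), -l (a line is a record), -q (quiet), -n (key field position), -s (sort order file, one key per line), and -c (comparison type).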

The run above was on a system like this:

OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution        : Debian 8.9 (jessie) 
bash GNU bash 4.3.30

Some details on msort:

msort   sort records in complex ways (man)
Path    : /usr/bin/msort
Version : 8.53
Type    : ELF 64-bit LSB executable, x86-64, version 1 (SYSV ...)
Help    : probably available with -h,--help
Repo    : Debian 8.9 (jessie) 
Home    : http://www.billposer.org/Software/msort.html (pm)
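
Where msort is not installed, the sort-order-file idea can be approximated with standard tools. The sketch below is not msort's implementation: it reads the order file into an awk lookup table and reuses the rank-prefix trick. The function name sort_by_order_file, the fallback rank 999999, and the scratch directory are assumptions for illustration, and the order file is assumed to be non-empty.

```shell
#!/bin/sh
# Recreate the two files from this answer in a scratch directory.
dir=$(mktemp -d)
printf '%s\t%s\t%s\t%s\t%s\n' \
  RollNo Names Class Subject Position \
  101 Anna V Maths Average \
  102 Bob V Maths Good \
  103 Charles VI Science Good \
  104 Darwin VI Science Improve \
  105 Eva VII English Improve > "$dir/data1"
printf '%s\n' Good Average Improve > "$dir/data2"

# sort_by_order_file ORDER DATA [FIELD]: order the rows of DATA by where the
# value of FIELD (default 5) appears in ORDER; the header line stays first.
sort_by_order_file() {
  awk -F'\t' -v OFS='\t' -v col="${3:-5}" '
    NR == FNR { rank[$0] = FNR; next }           # first file: key -> rank
    FNR == 1  { print 0, FNR, $0; next }         # header of the data file
    { r = ($col in rank) ? rank[$col] : 999999   # keys missing from ORDER go last
      print r, FNR, $0 }
  ' "$1" "$2" | sort -t "$(printf '\t')" -k1,1n -k2,2n | cut -f3-
}

out=$(sort_by_order_file "$dir/data2" "$dir/data1" 5)
printf '%s\n' "$out"
rm -rf "$dir"
```

Changing the desired order then only requires editing the order file, which is the same convenience the -s option gives msort.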

Best wishes ... cheers, drl
