I have a table with a few hundred rows:
a1
a2
a3
a4
b1
b2
b3
b4
c1
c2
c3
c4
... etc.
I want to output the rows in the following order:
a1
b1
c1
d1
a2
b2
c2
d2
a3
b3
c3
The script below selects the first block of rows:
$ awk '{if(NR==1||NR%4==1)print}'
But how can I loop it so that it does this for the whole file?
Answer 1
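You can use sort for this. Specifically, you can tell sort to do a general numeric sort (the g option), and you can control which character of the field the key starts at by using the X.Y notation rather than the more typical X,Y. Here the key 1.2 starts at the second character of the first field, so lines are ordered by the number that follows the letter; lines with equal numbers fall back to sort's whole-line comparison, which puts a before b before c.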
For example:
$ sort -k1.2g file
a1
b1
c1
a2
b2
c2
a3
b3
c3
a4
b4
c4
Relevant sort options:
-k, --key=KEYDEF
sort via a key; KEYDEF gives location and type
-g, --general-numeric-sort
compare according to general numerical value
KEYDEF is F[.C][OPTS][,F[.C][OPTS]] for start and stop position, where F is
a field number and C a character position in the field; both are origin 1,
and the stop position defaults to the line's end. If neither -t nor -b is
in effect, characters in a field are counted from the beginning of the
preceding whitespace. OPTS is one or more single-letter ordering options
[bdfgiMhnRrV], which override global ordering options for that key. If
no key is given, use the entire line as the key.
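As a small sketch of the KEYDEF syntax quoted above, the key can be spelled out in full, with the letter as an explicit second key instead of relying on sort's last-resort comparison. The file name sample.txt and the use of LC_ALL=C are assumptions for illustration, and this variant uses the n option rather than g, which is enough for plain integers.

```shell
# Build a small sample file (hypothetical name) in the layout from the question.
printf '%s\n' a1 a2 a3 a4 b1 b2 b3 b4 c1 c2 c3 c4 > sample.txt

# Key 1: field 1 from character 2 to the end of the field, compared numerically.
# Key 2: only character 1 of field 1 (the letter), compared as text.
# LC_ALL=C keeps the ordering independent of the current locale.
result=$(LC_ALL=C sort -k1.2,1n -k1.1,1.1 sample.txt)
printf '%s\n' "$result"
```

Stating both keys makes the intended order explicit: first by number, then by letter.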
Answer 2
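If the "step size" is always small (4 in your case), a quick and dirty approach is to read the file multiple times and pick out the records at each offset, for example: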
$ awk 'FNR==1 {k++} !((FNR-k)%4)' file file file file
a1
b1
c1
a2
b2
c2
a3
b3
c3
a4
b4
c4
Or, equivalently, using GNU Awk (and its BEGINFILE rule):
$ gawk 'BEGINFILE{k++} !((FNR-k)%4)' file file file file
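The same offset-picking idea can also be sketched as a single pass that buffers the file in memory and then loops over the offsets, so the file does not have to be named once per offset. The variable name step and the file name sample.txt are assumptions for illustration; this is an alternative sketch, not the method from the answer, and it assumes the file fits in memory.

```shell
# Build a small sample file (hypothetical name) in the layout from the question.
printf '%s\n' a1 a2 a3 a4 b1 b2 b3 b4 c1 c2 c3 c4 > sample.txt

# Buffer every line, then for each offset 1..step print every step-th line.
result=$(awk -v step=4 '
    { line[NR] = $0 }                      # remember each line by its number
    END {
        for (off = 1; off <= step; off++)  # one round per offset within a block
            for (i = off; i <= NR; i += step)
                print line[i]
    }
' sample.txt)
printf '%s\n' "$result"
```

Changing the value passed with -v step=... adapts the sketch to a different block size without repeating the file name.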