Printing new lines with awk

Question 1

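By default awk makes a single pass over the file, running every block in order against each line, which is why it gives the output you are seeing. You can get the behaviour you want by using arrays to save lines as you go, while still processing the file only once: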

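BEGIN {
    # Next free slot in each array
    AgeIndex = 1
    HeightIndex = 1
}
# Save each matching line instead of printing it straight away
/Age/ {
    ages[AgeIndex] = $0
    AgeIndex += 1
}
/Height/ {
    heights[HeightIndex] = $0
    HeightIndex += 1
}
END {
    # print appends a newline itself, so "\n" adds one extra blank line
    for (x = 1; x < AgeIndex; x++)
        print ages[x] "\n"
    for (x = 1; x < HeightIndex; x++)
        print heights[x] "\n"
}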

Save that into filter.awk and then run:

awk -f filter.awk output.txt > output2.txt

to get the output you want:

$ awk -f filter.awk < data
Age 1

Age 2

Height 1

Height 2

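What we're doing is creating two arrays, ages and heights, and saving each matching line into them. AgeIndex and HeightIndex hold how far into each array we've got. At the end, we print out every line we saved (along with the extra newline you wanted): first all the ages, then all the heights.

The arrays will end up holding every matching line, potentially the whole file, in memory. If your file is particularly large, you'll have to trade that memory use off against the time it takes to make multiple passes over the file. At that point it's essentially the same as a program in any other language; if you don't have any particular reason to use awk, you may prefer another language. To be honest, I think that is what I would advise: awk isn't buying you much here.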
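For comparison, the multi-pass side of that trade-off might look something like the sketch below. It keeps nothing in memory but reads the file once per pattern. The sample data and the temporary file are made up for illustration; they are not the asker's real file.

```shell
# Hypothetical sample input, interleaved the way the question describes.
tmp=$(mktemp)
printf 'Age 1\nHeight 1\nAge 2\nHeight 2\n' > "$tmp"

# One awk run per pattern: each run reads the whole file again,
# but no lines are stored in arrays.
# print $0 "\n" emits the line followed by one extra blank line.
two_pass=$(
    awk '/Age/    { print $0 "\n" }' "$tmp"
    awk '/Height/ { print $0 "\n" }' "$tmp"
)

printf '%s\n' "$two_pass"
rm -f "$tmp"
```

With only two or three patterns and a file that fits comfortably in memory, the single-pass array version is probably the better choice; this variant mainly makes sense when the file is very large.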

Question 2

With gawk (asorti is a gawk extension, so the awk below is assumed to be gawk):

$ awk -F"\t" '
    { a[$1]++ }
    END {
        n = asorti(a,b);
        for (i = 1; i <= n; i++) {
            print b[i];
            if (i%2 == 0) {
                printf "\n";
            }
        }
    }
' output.txt
Age 1
Age 2

Height 1
Height 2

Weight 1
Weight 2
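If gawk isn't available, a rough equivalent is to let sort do the ordering and have awk print a blank line whenever the first word changes. This is only a sketch: it splits fields on whitespace rather than tabs, does not de-duplicate lines, and puts blank lines between groups only (not after the last one). The sample file is made up for illustration.

```shell
# Hypothetical sample input in the interleaved order from the question.
tmp=$(mktemp)
printf 'Age 1\nHeight 1\nWeight 1\nAge 2\nHeight 2\nWeight 2\n' > "$tmp"

# sort brings lines with the same first word together;
# awk then emits an empty line each time that first word changes.
grouped=$(LC_ALL=C sort "$tmp" |
    awk 'NR > 1 && $1 != prev { print "" } { print; prev = $1 }')

printf '%s\n' "$grouped"
rm -f "$tmp"
```

Because it keys on the first word rather than on a fixed group size, it should keep working when there are more than two entries per name, which the i%2 test above would not.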

Question 3

I'm assuming the blank lines are not part of your actual file, or at least that you don't care about them. If so, all you need is sort:

$ cat output.txt
Age 1
Height 1
Weight 1
Age 2
Height 2
Weight 2

$ sort output.txt
Age 1
Age 2
Height 1
Height 2
Weight 1
Weight 2

However, unless your files are too large to hold in memory, it may be simpler to do the whole thing in one step:

grep -whE 'Age|Height|Weight' *txt | sort > outfile

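The command above searches for Age, Height or Weight in all files in the current directory whose names end in txt (*txt). The -w means "match whole words only" (so that Age does not match Ageing, for example). The -h is needed because, without it, the file name is printed along with each matching line when multiple input files are given. The -E enables extended regular expressions, which gives us | for OR.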
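To see what -w and -h each contribute, here is a small sketch using two throwaway files. The file contents, including the Ageing line, are invented for the demonstration.

```shell
# Two hypothetical input files in a scratch directory.
dir=$(mktemp -d)
printf 'Name 1\nAge 1\nAgeing 1\n' > "$dir/1.txt"
printf 'Name 2\nAge 2\n' > "$dir/2.txt"

# Without -h: with more than one input file, grep prefixes each match
# with the file name.
with_names=$(cd "$dir" && grep -wE 'Age|Height|Weight' *txt)

# With -h: only the matching lines. In both cases -w keeps "Ageing 1" out.
without_names=$(cd "$dir" && grep -whE 'Age|Height|Weight' *txt)

printf '%s\n' "$with_names"
printf '%s\n' "$without_names"
rm -rf "$dir"
```

The first printf should show lines such as 1.txt:Age 1, the second just Age 1 and Age 2.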

Note: if, for some reason, you really do want an extra blank line between entries (which is not what your grep command would produce), you can add it with:

grep -whE 'Age|Height|Weight' *txt | sort | sed 's/$/\n/'
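GNU sed turns the \n in that replacement into a newline; other sed implementations may not. If that is a concern, the same effect can probably be had with awk, which also fits the original question. A minimal sketch on made-up input:

```shell
# Print each line, then an empty line after it.
spaced=$(printf 'Age 1\nAge 2\n' | awk '{ print; print "" }')

printf '%s\n' "$spaced"
```

In the pipeline above this would replace the sed stage, i.e. ... | sort | awk '{ print; print "" }'.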

Example

$ for i in {1..3}; do echo -e "Name $i\nAge $i\nHeight $i\nWeight $i" > $i.txt; done
$ for f in *txt; do echo " -- $f --"; cat $f; done
 -- 1.txt --
Name 1
Age 1
Height 1
Weight 1
 -- 2.txt --
Name 2
Age 2
Height 2
Weight 2
 -- 3.txt --
Name 3
Age 3
Height 3
Weight 3

$ grep -whE 'Age|Height|Weight' *txt | sort
Age 1
Age 2
Age 3
Height 1
Height 2
Height 3
Weight 1
Weight 2
Weight 3

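In any case, even if sort won't cut it for you, I would do this sort of thing in Perl rather than awk (this assumes you want the extra blank lines, which you probably don't):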

$ perl -ane '$k{$F[0]}.=$_."\n" if /./; 
    END{print $k{$_},"\n" for sort keys (%k)}' output.txt 
Age 1

Age 2


Height 1

Height 2


Weight 1

Weight 2

If you don't want them, you can pipe the output through head -n -2 to remove the last two blank lines.
