We have the following file:
more /home/list.in
master01.fsdns.com AMBARI_METRICS STARTED
master02.fsdns.com AMBARI_METRICS STARTED
master03.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com AMBARI_METRICS STARTED
worker02.fsdns.com AMBARI_METRICS STARTED
worker03.fsdns.com AMBARI_METRICS STARTED
worker05.fsdns.com AMBARI_METRICS STARTED
worker06.fsdns.com AMBARI_METRICS STARTED
worker07.fsdns.com AMBARI_METRICS STARTED
worker08.fsdns.com AMBARI_METRICS STARTED
worker09.fsdns.com AMBARI_METRICS STARTED
master01.fsdns.com YARN STARTED
master02.fsdns.com YARN STARTED
master03.fsdns.com YARN STARTED
worker01.fsdns.com YARN STARTED
worker02.fsdns.com YARN STARTED
worker03.fsdns.com YARN STARTED
worker05.fsdns.com YARN STARTED
worker06.fsdns.com YARN STARTED
worker07.fsdns.com YARN STARTED
worker08.fsdns.com YARN STARTED
worker09.fsdns.com YARN STARTED
master01.fsdns.com HDFS STARTED
master02.fsdns.com HDFS STARTED
master03.fsdns.com HDFS STARTED
worker01.fsdns.com HDFS STARTED
worker02.fsdns.com HDFS STARTED
worker03.fsdns.com HDFS STARTED
worker05.fsdns.com HDFS STARTED
worker06.fsdns.com HDFS STARTED
worker07.fsdns.com HDFS STARTED
worker08.fsdns.com HDFS STARTED
worker09.fsdns.com HDFS STARTED
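We want to reorder the file list.in into the following structure (expected results),
so that all lines belonging to the same machine end up in the same group.
Expected results: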
master01.fsdns.com AMBARI_METRICS STARTED
master01.fsdns.com YARN STARTED
master01.fsdns.com HDFS STARTED
master02.fsdns.com AMBARI_METRICS STARTED
master02.fsdns.com YARN STARTED
master02.fsdns.com HDFS STARTED
master03.fsdns.com AMBARI_METRICS STARTED
master03.fsdns.com YARN STARTED
master03.fsdns.com HDFS STARTED
.
.
.
.
.
worker09.fsdns.com AMBARI_METRICS STARTED
worker09.fsdns.com YARN STARTED
worker09.fsdns.com HDFS STARTED
What I have tried so far:
for i in 01 02 03 04 05 06 07
do
grep worker$i /tmp/list.in
done
worker01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com YARN STARTED
worker01.fsdns.com HDFS STARTED
worker02.fsdns.com AMBARI_METRICS STARTED
worker02.fsdns.com YARN STARTED
worker02.fsdns.com HDFS STARTED
worker03.fsdns.com AMBARI_METRICS STARTED
worker03.fsdns.com YARN STARTED
worker03.fsdns.com HDFS STARTED
worker05.fsdns.com AMBARI_METRICS STARTED
worker05.fsdns.com YARN STARTED
worker05.fsdns.com HDFS STARTED
worker06.fsdns.com AMBARI_METRICS STARTED
worker06.fsdns.com YARN STARTED
worker06.fsdns.com HDFS STARTED
worker07.fsdns.com AMBARI_METRICS STARTED
worker07.fsdns.com YARN STARTED
worker07.fsdns.com HDFS STARTED
Answer 1
If the blank lines don't matter to you, a simple sort command might be:
sort -t. -k1 /home/list.in
Result (with leading blank lines):
master01.fsdns.com AMBARI_METRICS STARTED
master01.fsdns.com HDFS STARTED
master01.fsdns.com YARN STARTED
master02.fsdns.com AMBARI_METRICS STARTED
master02.fsdns.com HDFS STARTED
master02.fsdns.com YARN STARTED
master03.fsdns.com AMBARI_METRICS STARTED
master03.fsdns.com HDFS STARTED
master03.fsdns.com YARN STARTED
worker01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com HDFS STARTED
worker01.fsdns.com YARN STARTED
worker02.fsdns.com AMBARI_METRICS STARTED
worker02.fsdns.com HDFS STARTED
worker02.fsdns.com YARN STARTED
worker03.fsdns.com AMBARI_METRICS STARTED
worker03.fsdns.com HDFS STARTED
worker03.fsdns.com YARN STARTED
worker05.fsdns.com AMBARI_METRICS STARTED
worker05.fsdns.com HDFS STARTED
worker05.fsdns.com YARN STARTED
worker06.fsdns.com AMBARI_METRICS STARTED
worker06.fsdns.com HDFS STARTED
worker06.fsdns.com YARN STARTED
worker07.fsdns.com AMBARI_METRICS STARTED
worker07.fsdns.com HDFS STARTED
worker07.fsdns.com YARN STARTED
worker08.fsdns.com AMBARI_METRICS STARTED
worker08.fsdns.com HDFS STARTED
worker08.fsdns.com YARN STARTED
worker09.fsdns.com AMBARI_METRICS STARTED
worker09.fsdns.com HDFS STARTED
worker09.fsdns.com YARN STARTED
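Note that -k1 without an end position makes the sort key run from the first field to the end of the line, so the command above is effectively a whole-line sort. That is why HDFS comes before YARN within each host. If the original service order should be kept, as in the expected results, a stable sort on the hostname alone is one option. This is a minimal sketch, assuming a sort that supports the -s (stable) flag, as GNU and BSD sort do. The function name sort_by_host and the sample data are made up for illustration.

```shell
# Hypothetical helper: order lines by hostname only, keeping the
# original relative order of lines that share a hostname.
sort_by_host() {
    # -k1,1 limits the key to the first whitespace-separated field.
    # -s (assumed available) keeps equal keys in input order instead of
    # falling back to a whole-line comparison.
    LC_ALL=C sort -s -k1,1 "$1"
}

# Small sample in the same layout as list.in (illustrative data only).
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
worker01.fsdns.com AMBARI_METRICS STARTED
master01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com YARN STARTED
master01.fsdns.com YARN STARTED
worker01.fsdns.com HDFS STARTED
master01.fsdns.com HDFS STARTED
EOF

sort_by_host "$sample"
```

On the real file this would be called as sort_by_host /home/list.in.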
Answer 2
$ sort -k1,1 list.in |
awk '
/^[[:space:]]*$/ { next };
lasthost == "" { lasthost = $1 };
$1 == lasthost { print $0; next };
{print "\n" $0 ; lasthost=$1 }'
master01.fsdns.com AMBARI_METRICS STARTED
master01.fsdns.com HDFS STARTED
master01.fsdns.com YARN STARTED

master02.fsdns.com AMBARI_METRICS STARTED
master02.fsdns.com HDFS STARTED
master02.fsdns.com YARN STARTED

master03.fsdns.com AMBARI_METRICS STARTED
master03.fsdns.com HDFS STARTED
master03.fsdns.com YARN STARTED

worker01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com HDFS STARTED
worker01.fsdns.com YARN STARTED

worker02.fsdns.com AMBARI_METRICS STARTED
worker02.fsdns.com HDFS STARTED
worker02.fsdns.com YARN STARTED

worker03.fsdns.com AMBARI_METRICS STARTED
worker03.fsdns.com HDFS STARTED
worker03.fsdns.com YARN STARTED

worker05.fsdns.com AMBARI_METRICS STARTED
worker05.fsdns.com HDFS STARTED
worker05.fsdns.com YARN STARTED

worker06.fsdns.com AMBARI_METRICS STARTED
worker06.fsdns.com HDFS STARTED
worker06.fsdns.com YARN STARTED

worker07.fsdns.com AMBARI_METRICS STARTED
worker07.fsdns.com HDFS STARTED
worker07.fsdns.com YARN STARTED

worker08.fsdns.com AMBARI_METRICS STARTED
worker08.fsdns.com HDFS STARTED
worker08.fsdns.com YARN STARTED

worker09.fsdns.com AMBARI_METRICS STARTED
worker09.fsdns.com HDFS STARTED
worker09.fsdns.com YARN STARTED
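The awk script keeps track of the last hostname seen in field $1 and prints a newline before the current input line whenever the hostname changes. It also skips any line that is completely empty or contains only whitespace.
To avoid printing a blank line before the first record, it also checks whether the lasthost variable is empty (i.e. not yet defined) and sets it if so.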
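The same idea can be written a little more compactly by remembering the previous hostname and emitting the separator only when it changes. This is a minimal sketch rather than the exact script above. The function name group_with_blank_lines, the variable prev and the sample data are made up for illustration, and LC_ALL=C is set only to make the sort order predictable.

```shell
# Hypothetical helper: sort by hostname, then separate host groups
# with a single blank line.
group_with_blank_lines() {
    LC_ALL=C sort -k1,1 "$1" |
    awk '
        NF == 0 { next }                        # skip empty or whitespace-only lines
        prev != "" && $1 != prev { print "" }   # hostname changed: emit a separator
        { print; prev = $1 }                    # print the record, remember its host
    '
}

# Illustrative sample containing one blank line.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
worker01.fsdns.com YARN STARTED

master01.fsdns.com YARN STARTED
worker01.fsdns.com HDFS STARTED
master01.fsdns.com HDFS STARTED
EOF

group_with_blank_lines "$sample"
```

Because prev starts out empty, no separator is printed before the first record, which is the same effect the lasthost check achieves.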
Answer 3
This works (note that asorti() is a GNU awk extension):
awk '$1{a[$1];b[$2]}
END{asorti(a);for( i in a){for(j in b){printf("%s %s\n",a[i],j)};printf("\n")}}' file
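$1
For lines whose first field is not empty,
{a[$1];b[$2]}
create the arrays a (indexed by host) and b (indexed by service).
END{
After the whole file has been read,
asorti(a)
sort array a by its indices.
for( i in a ){
for(j in b){
printf("%s %s\n",a[i],j)};
printf("\n")}
For every machine and every service, print the sorted value, then print a new (empty) line after each machine.
}' file
The input file.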
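The command above prints the host and service name only, pairing every host with every service it has seen anywhere in the file. A portable variant that does not depend on asorti() and keeps the complete input lines is sketched below. It is not the answer's method: it buffers the lines per hostname and prints the groups in order of first appearance, which for list.in already matches the expected host order. The names group_by_host, order and lines, and the sample data, are made up for illustration.

```shell
# Hypothetical helper: group full lines by hostname in first-seen order,
# using only POSIX awk features.
group_by_host() {
    awk '
        NF == 0 { next }                       # ignore blank lines
        !($1 in lines) { order[++n] = $1 }     # remember each host the first time it appears
        { lines[$1] = lines[$1] $0 "\n" }      # append the whole line to its host group
        END {
            for (i = 1; i <= n; i++) {
                printf "%s", lines[order[i]]   # print the buffered group
                if (i < n) print ""            # blank line between groups
            }
        }
    ' "$1"
}

# Illustrative sample in the same layout as list.in.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
master01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com AMBARI_METRICS STARTED
master01.fsdns.com YARN STARTED
worker01.fsdns.com YARN STARTED
master01.fsdns.com HDFS STARTED
worker01.fsdns.com HDFS STARTED
EOF

group_by_host "$sample"
```

Since nothing is sorted, the services keep their original order within each host (AMBARI_METRICS, YARN, HDFS in the sample).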
Answer 4
Done with awk and sed to achieve the same result. Tested and it worked fine.
i=$(awk -F "." '{print $1}' l.txt | sed '/^$/d' | sed 's/[[:space:]]//g' | sort -u); for j in $i; do sed -n "/$j/p" l.txt; done
Output:
master01.fsdns.com AMBARI_METRICS STARTED
master01.fsdns.com YARN STARTED
master01.fsdns.com HDFS STARTED
master02.fsdns.com AMBARI_METRICS STARTED
master02.fsdns.com YARN STARTED
master02.fsdns.com HDFS STARTED
master03.fsdns.com AMBARI_METRICS STARTED
master03.fsdns.com YARN STARTED
master03.fsdns.com HDFS STARTED
worker01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com YARN STARTED
worker01.fsdns.com HDFS STARTED
worker02.fsdns.com AMBARI_METRICS STARTED
worker02.fsdns.com YARN STARTED
worker02.fsdns.com HDFS STARTED
worker03.fsdns.com AMBARI_METRICS STARTED
worker03.fsdns.com YARN STARTED
worker03.fsdns.com HDFS STARTED
worker05.fsdns.com AMBARI_METRICS STARTED
worker05.fsdns.com YARN STARTED
worker05.fsdns.com HDFS STARTED
worker06.fsdns.com AMBARI_METRICS STARTED
worker06.fsdns.com YARN STARTED
worker06.fsdns.com HDFS STARTED
worker07.fsdns.com AMBARI_METRICS STARTED
worker07.fsdns.com YARN STARTED
worker07.fsdns.com HDFS STARTED
worker08.fsdns.com AMBARI_METRICS STARTED
worker08.fsdns.com YARN STARTED
worker08.fsdns.com HDFS STARTED
worker09.fsdns.com AMBARI_METRICS STARTED
worker09.fsdns.com YARN STARTED
worker09.fsdns.com HDFS STARTED
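Interpolating the hostname into a sed pattern treats it as a regular expression and matches it anywhere on the line. A variant of the same loop that compares the hostname literally at the start of each line is sketched below. The function name print_grouped and the sample data are made up for illustration, and the file is still read once per host, as in the command above.

```shell
# Hypothetical helper: print the lines of each host together,
# matching the host prefix literally at the start of the line.
print_grouped() {
    # Unique host prefixes (text before the first dot); empty lines are skipped.
    awk -F. 'NF { print $1 }' "$1" | LC_ALL=C sort -u |
    while IFS= read -r host; do
        # index($0, h) == 1 is true only when the line starts with "host."
        awk -v h="$host." 'index($0, h) == 1' "$1"
    done
}

# Illustrative sample in the same layout as l.txt.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
worker01.fsdns.com AMBARI_METRICS STARTED
master01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com YARN STARTED
master01.fsdns.com YARN STARTED
worker01.fsdns.com HDFS STARTED
master01.fsdns.com HDFS STARTED
EOF

print_grouped "$sample"
```

Appending the dot to the prefix keeps a host such as worker01 from also matching a hypothetical worker010.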