Linux: reorder the lines in a file by machine number

We have the following file:

    more /home/list.in

    master01.fsdns.com AMBARI_METRICS STARTED
    master02.fsdns.com AMBARI_METRICS STARTED
    master03.fsdns.com AMBARI_METRICS STARTED
    worker01.fsdns.com AMBARI_METRICS STARTED
    worker02.fsdns.com AMBARI_METRICS STARTED
    worker03.fsdns.com AMBARI_METRICS STARTED
    worker05.fsdns.com AMBARI_METRICS STARTED
    worker06.fsdns.com AMBARI_METRICS STARTED
    worker07.fsdns.com AMBARI_METRICS STARTED
    worker08.fsdns.com AMBARI_METRICS STARTED
    worker09.fsdns.com AMBARI_METRICS STARTED

    master01.fsdns.com YARN STARTED
    master02.fsdns.com YARN STARTED
    master03.fsdns.com YARN STARTED
    worker01.fsdns.com YARN STARTED
    worker02.fsdns.com YARN STARTED
    worker03.fsdns.com YARN STARTED
    worker05.fsdns.com YARN STARTED
    worker06.fsdns.com YARN STARTED
    worker07.fsdns.com YARN STARTED
    worker08.fsdns.com YARN STARTED
    worker09.fsdns.com YARN STARTED

    master01.fsdns.com HDFS STARTED
    master02.fsdns.com HDFS STARTED
    master03.fsdns.com HDFS STARTED
    worker01.fsdns.com HDFS STARTED
    worker02.fsdns.com HDFS STARTED
    worker03.fsdns.com HDFS STARTED
    worker05.fsdns.com HDFS STARTED
    worker06.fsdns.com HDFS STARTED
    worker07.fsdns.com HDFS STARTED
    worker08.fsdns.com HDFS STARTED
    worker09.fsdns.com HDFS STARTED

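We want to reorder the file list.in into the following structure (the expected result),

so that all lines belonging to the same machine end up in the same group.

Expected result: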

    master01.fsdns.com AMBARI_METRICS STARTED
    master01.fsdns.com YARN STARTED
    master01.fsdns.com HDFS STARTED

    master02.fsdns.com AMBARI_METRICS STARTED
    master02.fsdns.com YARN STARTED
    master02.fsdns.com HDFS STARTED

    master03.fsdns.com AMBARI_METRICS STARTED
    master03.fsdns.com YARN STARTED
    master03.fsdns.com HDFS STARTED
    .
    .
    .
    .
    . 
    worker09.fsdns.com AMBARI_METRICS STARTED
    worker09.fsdns.com YARN STARTED
    worker09.fsdns.com HDFS STARTED

What I have tried so far:

 for i in 01 02 03 04 05 06 07
 do
  grep worker$i /home/list.in
 done


 worker01.fsdns.com AMBARI_METRICS STARTED
 worker01.fsdns.com YARN STARTED
 worker01.fsdns.com HDFS STARTED
 worker02.fsdns.com AMBARI_METRICS STARTED
 worker02.fsdns.com YARN STARTED
 worker02.fsdns.com HDFS STARTED
 worker03.fsdns.com AMBARI_METRICS STARTED
 worker03.fsdns.com YARN STARTED
 worker03.fsdns.com HDFS STARTED
 worker05.fsdns.com AMBARI_METRICS STARTED
 worker05.fsdns.com YARN STARTED
 worker05.fsdns.com HDFS STARTED
 worker06.fsdns.com AMBARI_METRICS STARTED
 worker06.fsdns.com YARN STARTED
 worker06.fsdns.com HDFS STARTED
 worker07.fsdns.com AMBARI_METRICS STARTED
 worker07.fsdns.com YARN STARTED
 worker07.fsdns.com HDFS STARTED

Answer 1

If the blank lines don't matter to you, a simple sort command could be:

sort -t. -k1 /home/list.in

Result (preceded by the blank lines, which sort first):

master01.fsdns.com AMBARI_METRICS STARTED
master01.fsdns.com HDFS STARTED
master01.fsdns.com YARN STARTED
master02.fsdns.com AMBARI_METRICS STARTED
master02.fsdns.com HDFS STARTED
master02.fsdns.com YARN STARTED
master03.fsdns.com AMBARI_METRICS STARTED
master03.fsdns.com HDFS STARTED
master03.fsdns.com YARN STARTED
worker01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com HDFS STARTED
worker01.fsdns.com YARN STARTED
worker02.fsdns.com AMBARI_METRICS STARTED
worker02.fsdns.com HDFS STARTED
worker02.fsdns.com YARN STARTED
worker03.fsdns.com AMBARI_METRICS STARTED
worker03.fsdns.com HDFS STARTED
worker03.fsdns.com YARN STARTED
worker05.fsdns.com AMBARI_METRICS STARTED
worker05.fsdns.com HDFS STARTED
worker05.fsdns.com YARN STARTED
worker06.fsdns.com AMBARI_METRICS STARTED
worker06.fsdns.com HDFS STARTED
worker06.fsdns.com YARN STARTED
worker07.fsdns.com AMBARI_METRICS STARTED
worker07.fsdns.com HDFS STARTED
worker07.fsdns.com YARN STARTED
worker08.fsdns.com AMBARI_METRICS STARTED
worker08.fsdns.com HDFS STARTED
worker08.fsdns.com YARN STARTED
worker09.fsdns.com AMBARI_METRICS STARTED
worker09.fsdns.com HDFS STARTED
worker09.fsdns.com YARN STARTED
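
Because `-k1` has no end field, the sort key above runs to the end of the line, so the services are also sorted alphabetically (HDFS before YARN). If the original service order within each host should be kept, as in the expected result, a stable sort restricted to the first field is one option. The following is only a minimal sketch: it assumes a `sort` that supports `-s` (GNU and BSD do; the flag is not in POSIX), and the sample file created with `mktemp` is a hypothetical stand-in for /home/list.in.

```shell
# Hypothetical sample data standing in for /home/list.in.
sample=$(mktemp)
cat > "$sample" <<'EOF'
master01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com AMBARI_METRICS STARTED

master01.fsdns.com YARN STARTED
worker01.fsdns.com YARN STARTED

master01.fsdns.com HDFS STARTED
worker01.fsdns.com HDFS STARTED
EOF

# grep drops the blank separator lines; sort -s -k1,1 compares only the
# hostname field and keeps lines with equal hostnames in input order.
sorted=$(grep -v '^[[:space:]]*$' "$sample" | sort -s -k1,1)
printf '%s\n' "$sorted"
rm -f "$sample"
```

With this sample, each host's lines come out in the order AMBARI_METRICS, YARN, HDFS, matching the order in which they appear in the input.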

Answer 2

$ sort -k1,1 list.in  | 
    awk '
      /^[[:space:]]*$/ { next };
      lasthost == "" { lasthost = $1 };
      $1 == lasthost { print $0; next };
      {print "\n" $0 ; lasthost=$1 }' 
master01.fsdns.com AMBARI_METRICS STARTED
master01.fsdns.com HDFS STARTED
master01.fsdns.com YARN STARTED

master02.fsdns.com AMBARI_METRICS STARTED
master02.fsdns.com HDFS STARTED
master02.fsdns.com YARN STARTED

master03.fsdns.com AMBARI_METRICS STARTED
master03.fsdns.com HDFS STARTED
master03.fsdns.com YARN STARTED

worker01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com HDFS STARTED
worker01.fsdns.com YARN STARTED

worker02.fsdns.com AMBARI_METRICS STARTED
worker02.fsdns.com HDFS STARTED
worker02.fsdns.com YARN STARTED

worker03.fsdns.com AMBARI_METRICS STARTED
worker03.fsdns.com HDFS STARTED
worker03.fsdns.com YARN STARTED

worker05.fsdns.com AMBARI_METRICS STARTED
worker05.fsdns.com HDFS STARTED
worker05.fsdns.com YARN STARTED

worker06.fsdns.com AMBARI_METRICS STARTED
worker06.fsdns.com HDFS STARTED
worker06.fsdns.com YARN STARTED

worker07.fsdns.com AMBARI_METRICS STARTED
worker07.fsdns.com HDFS STARTED
worker07.fsdns.com YARN STARTED

worker08.fsdns.com AMBARI_METRICS STARTED
worker08.fsdns.com HDFS STARTED
worker08.fsdns.com YARN STARTED

worker09.fsdns.com AMBARI_METRICS STARTED
worker09.fsdns.com HDFS STARTED
worker09.fsdns.com YARN STARTED

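The awk script keeps track of the last hostname seen in field $1 and, whenever the hostname changes, prints a newline before the current input line. It also skips any line that is completely empty or contains only whitespace.

To avoid printing a blank line before the first record, it also checks whether the lasthost variable is empty (i.e. not yet set) and, if so, sets it.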
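
The same grouping can also be done in one awk pass without `sort`, by collecting each host's lines in an array. This is an alternative sketch rather than the method described above: hosts come out in the order they first appear in the file instead of sorted order, and the array names `lines` and `order` as well as the sample file are assumptions made for illustration.

```shell
# Hypothetical sample data standing in for list.in.
sample=$(mktemp)
cat > "$sample" <<'EOF'
master01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com AMBARI_METRICS STARTED

master01.fsdns.com YARN STARTED
worker01.fsdns.com YARN STARTED

master01.fsdns.com HDFS STARTED
worker01.fsdns.com HDFS STARTED
EOF

grouped=$(awk '
  /^[[:space:]]*$/ { next }              # skip blank lines
  !($1 in lines)   { order[++n] = $1 }   # remember hosts in first-seen order
  { lines[$1] = lines[$1] $0 "\n" }      # append the line to its host group
  END {
    # print the groups, separated by one empty line
    for (i = 1; i <= n; i++)
      printf "%s%s", (i > 1 ? "\n" : ""), lines[order[i]]
  }' "$sample")
printf '%s\n' "$grouped"
rm -f "$sample"
```

This keeps the whole file in memory, which is fine for a status list of this size but worth keeping in mind for very large inputs.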

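Answer 3

This works with GNU awk (asorti is a gawk extension). The status column is stored as well, so that each line is reproduced in full:

awk '$1{a[$1];b[$2];s[$1,$2]=$3}
END{n=asorti(a);for(i=1;i<=n;i++){for(j in b){if((a[i],j) in s)printf("%s %s %s\n",a[i],j,s[a[i],j])};printf("\n")}}' file

$1 for lines whose first field is not empty,
{a[$1];b[$2];s[$1,$2]=$3} record the host in array a, the service in array b and the status in array s
END{ after the whole file has been read,
n=asorti(a) sort the indices of a (the hostnames); a[1] .. a[n] now hold them in sorted order
for(i=1;i<=n;i++){ for each machine, in sorted order,
for(j in b){ for each service (awk does not guarantee the order of this loop),
if((a[i],j) in s)printf("%s %s %s\n",a[i],j,s[a[i],j])}; print the line if that machine has that service
printf("\n")} print a new (empty) line after each machine
}' file for the input file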

Answer 4

This uses awk and sed to achieve the same thing. Tested, and it worked fine.

i=$(awk -F "." '{print $1}' l.txt | sed '/^$/d' | sed -E 's/[[:space:]]+//g' | sort -u); for j in $i; do sed -n "/^$j\./p" l.txt; done

Output:

master01.fsdns.com AMBARI_METRICS STARTED
master01.fsdns.com YARN STARTED
master01.fsdns.com HDFS STARTED
master02.fsdns.com AMBARI_METRICS STARTED
master02.fsdns.com YARN STARTED
master02.fsdns.com HDFS STARTED
master03.fsdns.com AMBARI_METRICS STARTED
master03.fsdns.com YARN STARTED
master03.fsdns.com HDFS STARTED
worker01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com YARN STARTED
worker01.fsdns.com HDFS STARTED
worker02.fsdns.com AMBARI_METRICS STARTED
worker02.fsdns.com YARN STARTED
worker02.fsdns.com HDFS STARTED
worker03.fsdns.com AMBARI_METRICS STARTED
worker03.fsdns.com YARN STARTED
worker03.fsdns.com HDFS STARTED
worker05.fsdns.com AMBARI_METRICS STARTED
worker05.fsdns.com YARN STARTED
worker05.fsdns.com HDFS STARTED
worker06.fsdns.com AMBARI_METRICS STARTED
worker06.fsdns.com YARN STARTED
worker06.fsdns.com HDFS STARTED
worker07.fsdns.com AMBARI_METRICS STARTED
worker07.fsdns.com YARN STARTED
worker07.fsdns.com HDFS STARTED
worker08.fsdns.com AMBARI_METRICS STARTED
worker08.fsdns.com YARN STARTED
worker08.fsdns.com HDFS STARTED
worker09.fsdns.com AMBARI_METRICS STARTED
worker09.fsdns.com YARN STARTED
worker09.fsdns.com HDFS STARTED
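
The loop above prints the groups back to back. A variant that also emits the blank line between groups, and compares the hostname exactly instead of by regular expression, might look like the sketch below. The variable names and the sample file are assumptions for illustration, and the unquoted `$hosts` relies on hostnames containing no whitespace or glob characters.

```shell
# Hypothetical sample data standing in for l.txt.
sample=$(mktemp)
cat > "$sample" <<'EOF'
master01.fsdns.com AMBARI_METRICS STARTED
worker01.fsdns.com AMBARI_METRICS STARTED

master01.fsdns.com YARN STARTED
worker01.fsdns.com YARN STARTED

master01.fsdns.com HDFS STARTED
worker01.fsdns.com HDFS STARTED
EOF

# Unique hostnames: first field of every non-blank line.
hosts=$(awk 'NF { print $1 }' "$sample" | sort -u)

out=$(for h in $hosts; do
  # Print the lines whose first field equals this hostname exactly.
  awk -v h="$h" '$1 == h' "$sample"
  echo   # blank line after each group
done)
printf '%s\n' "$out"
rm -f "$sample"
```

Like the original loop, this reads the file once per host, so it is best suited to small files.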
