Multi-column alignment

My raw data is:

   id=ABC name=Banana DB Connection type=FruitMarket
    XYZ_1 ABC.xml
    XYZ_2 ABC.xml
    XYZ_3 ABC.xml
    "Fruits/Mango/#Common"
    "Fruits/Mango/#Bizzare"
    "Fruits/Mango/#Common"

    id=EFG name=FruitHouse type=jms
    XYZ_4 EFG.xml
    "Fruits/Plum Orange"

    id=JKL name=JMSWriteConnect type=jms
    XYZ_4 JKL.xml
    "Fruits/Plum Orange"

    id=TMZ name=Banana DB Connection type=FruitMarket
    XYZ_5 TMZ.xml
    "Fruits/Mango/Backup/Apple"

    id=LDL name=Banana DB Market-Connect type=FruitMarket
    XYZ_6 LDL.xml
    XYZ_7 LDL.xml
    XYZ_8 LDL.xml
    XYZ_9 LDL.xml
    XYZ_10 LDL.xml
    XYZ_11 LDL.xml
    "Fruits/Mango/#Common"
    "Fruits/Mango/#Common"
    "VEG/Mango/#NOT"
    "Fruits/Mango/#Common"
    "Fruits/Mango/#NOT"
    "Fruits/Mango/#Common"

Using shell scripting (awk, sed, bash), I want to align it as follows (final output):

   id=ABC name=Banana DB Connection type=FruitMarket
    XYZ_1 ABC.xml "Fruits/Mango/#Common"
    XYZ_2 ABC.xml "Fruits/Mango/#Bizzare"
    XYZ_3 ABC.xml "Fruits/Mango/#Common"

    id=EFG name=FruitHouse type=jms
    XYZ_4 EFG.xml "Fruits/Plum Orange"

    id=JKL name=JMSWriteConnect type=jms
    XYZ_4 JKL.xml "Fruits/Plum Orange"

    id=TMZ name=Banana DB Connection type=FruitMarket
    XYZ_5 TMZ.xml "Fruits/Mango/Backup/Apple"

    id=LDL name=Banana DB Market-Connect type=FruitMarket
    XYZ_6 LDL.xml "Fruits/Mango/#Common"
    XYZ_7 LDL.xml "Fruits/Mango/#Common"
    XYZ_8 LDL.xml "VEG/Mango/#NOT"
    XYZ_9 LDL.xml "Fruits/Mango/#Common"
    XYZ_10 LDL.xml "Fruits/Mango/#NOT"
    XYZ_11 LDL.xml "Fruits/Mango/#Common"

The amount of whitespace within the lines is not important. Any pointers would be helpful.

Answer 1

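Assuming that each record always has exactly one header line (id/name/type), and that the record body consists of an equal number of XYZ_n LDL.xml lines and category (Fruits/VEG) lines, you could use GNU awk (gawk) in paragraph mode and a coprocess (read back with getline into a variable) to communicate with the pr command in two-column mode: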

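gawk '
  BEGIN {
    RS = ""; FS = "\n"      # paragraph mode: one record per block, one field per line
    cmd = "pr -T -s -2"     # pr: no page headers, tab-separated, two columns
  }
  {
    print $1                # header line
    for (i = 2; i <= NF; i++)
      print $i |& cmd       # send the body lines to the coprocess
    close(cmd, "to")        # close the write end so that pr sees end of input
    while ((cmd |& getline line) > 0)
      print line
    close(cmd)
    print ""
  }
' file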
   id=ABC name=Banana DB Connection type=FruitMarket
    XYZ_1 ABC.xml       "Fruits/Mango/#Common"
    XYZ_2 ABC.xml       "Fruits/Mango/#Bizzare"
    XYZ_3 ABC.xml       "Fruits/Mango/#Common"

    id=EFG name=FruitHouse type=jms
    XYZ_4 EFG.xml       "Fruits/Plum Orange"

    id=JKL name=JMSWriteConnect type=jms
    XYZ_4 JKL.xml       "Fruits/Plum Orange"

    id=TMZ name=Banana DB Connection type=FruitMarket
    XYZ_5 TMZ.xml       "Fruits/Mango/Backup/Apple"

    id=LDL name=Banana DB Market-Connect type=FruitMarket
    XYZ_6 LDL.xml       "Fruits/Mango/#Common"
    XYZ_7 LDL.xml       "Fruits/Mango/#Common"
    XYZ_8 LDL.xml       "VEG/Mango/#NOT"
    XYZ_9 LDL.xml       "Fruits/Mango/#Common"
    XYZ_10 LDL.xml      "Fruits/Mango/#NOT"
    XYZ_11 LDL.xml      "Fruits/Mango/#Common"
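
The assumption stated at the top of this answer (one header line, then a body that splits into two equal halves) can be checked before any of the scripts on this page are run. The following is only a minimal sketch: check_records is a hypothetical helper name, not part of the answer, and it assumes that records are separated by blank lines.

```shell
# check_records FILE: report every paragraph whose body (all lines after
# the header) has an odd number of lines and so cannot be paired up.
# Hypothetical helper for illustration; exits non-zero if any record fails.
check_records() {
  awk -v RS="" -v FS="\n" '
    (NF - 1) % 2 != 0 {
      hdr = $1
      sub(/^[ \t]+/, "", hdr)          # drop leading indentation
      printf "record %d (%s): %d body lines\n", NR, hdr, NF - 1
      bad = 1
    }
    END { exit bad }
  ' "$1"
}
```

Running check_records file prints nothing and returns 0 when every record can be paired; otherwise it lists the offending records.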

Answer 2

Perl:

perl -00 -F'\n' -anE '
    $n = ($#F + 1)/2;
    say $F[0];
    say $F[$_], $F[$_+$n] for (1..$n);
    say "";
' raw
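  • -00 reads the file in paragraph mode, so each blank-line-separated block is one record
  • -F'\n' uses the newline as the field separator
  • -a "autosplits" each record into fields stored in the @F array
  • -n loops over the records in the file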
   id=ABC name=Banana DB Connection type=FruitMarket
    XYZ_1 ABC.xml    "Fruits/Mango/#Common"
    XYZ_2 ABC.xml    "Fruits/Mango/#Bizzare"
    XYZ_3 ABC.xml    "Fruits/Mango/#Common"

    id=EFG name=FruitHouse type=jms
    XYZ_4 EFG.xml    "Fruits/Plum Orange"

    id=JKL name=JMSWriteConnect type=jms
    XYZ_4 JKL.xml    "Fruits/Plum Orange"

    id=TMZ name=Banana DB Connection type=FruitMarket
    XYZ_5 TMZ.xml    "Fruits/Mango/Backup/Apple"

    id=LDL name=Banana DB Market-Connect type=FruitMarket
    XYZ_6 LDL.xml    "Fruits/Mango/#Common"
    XYZ_7 LDL.xml    "Fruits/Mango/#Common"
    XYZ_8 LDL.xml    "VEG/Mango/#NOT"
    XYZ_9 LDL.xml    "Fruits/Mango/#Common"
    XYZ_10 LDL.xml    "Fruits/Mango/#NOT"
    XYZ_11 LDL.xml    "Fruits/Mango/#Common"

Answer 3

Using paste, grep and sed:

paste -d ' '\
 <(grep -v '"' file)\
 <(grep -v '\.xml' file | sed 's/^[[:blank:]]*//;s/id=.*//')

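The first grep gets all lines without a double quote. These are the empty lines and the lines containing the IDs and the XML file names. The second grep gets all lines that do not contain an XML file name. From these, sed removes leading space/tab characters and any string starting with id=, so each header line becomes an empty line and the two streams keep the same number of lines. paste then combines the two results line by line, using a space character as the delimiter.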
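
Process substitution, <(...), is a feature of bash, ksh and zsh rather than of plain sh. As a rough sketch, the same pipeline can be written with temporary files instead. The function name join_columns is made up for this example, and it assumes that mktemp is available on the system.

```shell
# join_columns FILE: the paste/grep/sed pipeline, using temporary files
# instead of <(...). join_columns is a hypothetical name for this sketch.
join_columns() {
  left=$(mktemp) || return 1
  right=$(mktemp) || { rm -f "$left"; return 1; }
  # Left stream: blank lines, header lines and XML lines (no double quote).
  grep -v '"' "$1" > "$left"
  # Right stream: blank lines, emptied header lines and the quoted lines.
  grep -v '\.xml' "$1" | sed 's/^[[:blank:]]*//;s/id=.*//' > "$right"
  # Both streams have one line per output row, so paste lines them up.
  paste -d ' ' "$left" "$right"
  rm -f "$left" "$right"
}
```

Because the header's partner line is empty, each header line in the result ends with a single trailing space.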

Answer 4

An awk version:

awk -v RS="" -v FS="\n" '{print $1; for (i=2; i<=((NF+1)/2); i+=1)
    {print $i, $((NF+1)/2+i-1)}; print ""}' file
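
The question says the exact spacing does not matter, but if evenly aligned columns are wanted, the awk one-liner in this answer can be extended to pad the first column. This is only a sketch under the same assumption as the other answers (one header line, then two equal halves). The name align_records and the default width of 16 are arbitrary choices made for illustration.

```shell
# align_records FILE [WIDTH]: pair up the two halves of each record and
# pad the first column to WIDTH characters (default 16, chosen arbitrarily).
# Hypothetical helper; assumes every record body has an even line count.
align_records() {
  awk -v RS="" -v FS="\n" -v w="${2:-16}" '
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    BEGIN { fmt = "%-" w "s %s\n" }    # left-justified, fixed-width first column
    {
      print trim($1)                   # header line
      half = (NF - 1) / 2              # number of rows in the body
      for (i = 2; i <= half + 1; i++)
        printf fmt, trim($i), trim($(i + half))
      print ""
    }
  ' "$1"
}
```

For example, align_records file 20 pads the XYZ_n column to 20 characters before each quoted category.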
