Matching missing values in a text file

This follows on from this question: Find missing values in a text file

I have 2 files containing the following data.

example

Name             Feature
Marry            Lecturer
Marry            Student
Marry            Leader
Bob              Lecturer
Bob              Student
Som              Student

feature

 Lecturer
 Student
 Leader 

I use the code below to find, for each feature, the names in the example file that are missing it:

#!/bin/bash
rm -f *.missing names.all
feature=feature
sed -n '1!p' example.txt | cut -d ' ' -f 1 | sort -u > names.all
for i in $(cat $feature)
do
  fgrep $i example.txt | cut -d ' ' -f 1 | cat - names.all | sort | uniq -u >  $i.missing 
done 

This code gives me 3 separate files, Lecturer.missing, Student.missing and Leader.missing, each containing all the names that do not have that feature.
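For reference, the loop relies on a set-difference idiom: once the names that have a feature are concatenated with the full name list, those names appear twice, so `sort | uniq -u` keeps only the names that appear once, i.e. the ones lacking the feature. A minimal sketch with inline data (the helper name `missing_names` is hypothetical, and it assumes unique names with no whitespace inside them):

```shell
#!/bin/bash
# Hypothetical helper: print the names in the full list ($1) that are absent
# from the subset ($2). Both arguments are space-separated lists of unique
# names, and the subset is assumed to contain only names from the full list.
missing_names() {
  # Names present in both lists occur twice after concatenation,
  # so uniq -u drops them and keeps the ones that occur exactly once.
  printf '%s\n' $1 $2 | sort | uniq -u
}

# All names versus the names that have the "Lecturer" feature.
missing_names "Bob Marry Som" "Bob Marry"
```

With the sample data this prints only Som, which matches what ends up in Lecturer.missing.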

But I want the data in a single file, with output like this:

Lecturer   Student   Leader
  Som                 Bob
                      Som

I tried appending the data to the same file, but that does not work.

Answer 1

This code

awk '
  NR == FNR {feature[$1]=1; next} 
  $1 != "Name" {name[$1]=1; role[$1,$2]=1} 
  END {
    for (f in feature)
      printf "%-12s", f
    print ""
    for (n in name) { 
      for (f in feature) 
        printf "%-12s", (n SUBSEP f in role ? " " : n)
      print ""
    }
  }
' features roles 

gives this output

Lecturer    Student     Leader      

                        Bob         
Som                     Som         

Close enough?
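If the exact layout from the question matters (columns in the order of the feature file, names stacked from the top with no blank rows), one possible variation is to remember insertion order in numerically indexed arrays instead of relying on `for (x in array)`, whose order is unspecified in awk. This is only a sketch: the function name `missing_table` is hypothetical, the file names are the ones from the question, and the 12-character column width is carried over from the code above.

```shell
#!/bin/bash
# Sketch: for each feature column, list the names that lack that feature.
# $1 = feature file, $2 = name/feature file with a header line (assumed layout).
missing_table() {
  awk '
    # First file: remember features in the order they are listed.
    NR == FNR { if (NF) feat[++nf] = $1; next }
    # Second file: skip the header, remember names in first-seen order.
    FNR > 1 && NF >= 2 {
      if (!($1 in seen)) { seen[$1] = 1; name[++nn] = $1 }
      has[$1, $2] = 1
    }
    END {
      for (j = 1; j <= nf; j++) {
        printf "%-12s", feat[j]
        # Stack the names missing feature j from the top of column j.
        for (i = 1; i <= nn; i++)
          if (!((name[i], feat[j]) in has)) {
            cell[++cnt[j], j] = name[i]
            if (cnt[j] > rows) rows = cnt[j]
          }
      }
      print ""
      # Print the stacked cells row by row, padding empty cells.
      for (r = 1; r <= rows; r++) {
        for (j = 1; j <= nf; j++)
          printf "%-12s", ((r, j) in cell ? cell[r, j] : "")
        print ""
      }
    }
  ' "$1" "$2"
}
```

With the question's files, `missing_table feature example.txt` should list Som under Lecturer, nothing under Student, and Bob then Som under Leader.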

Answer 2

All the explanation is in the comments inside the code:

awk '
  # build array f whose keys are the FEATURE entries from file "feature"
  FNR==NR{f[$1]=1;next}
  # for each NAME, collect all of its FEATUREs into one string
  FNR>1{e[$1]=e[$1]" "$2}
  # loop over each element of the FEATURE array
  END{for (i in f) {
        # build the header row from the FEATURE elements
        r[0]=r[0] i" "
        # reset the row counter for this FEATURE
        c=0
        # go through all NAMEs
        for (n in e)
          # check whether the FEATURE is missing for this NAME
          if(e[n] !~ i){
            # advance to the next row number
            ++c
            # construct the appropriate row
            if(c in r)
              # if the row exists, append the value to it
              r[c]=r[c] " " n
            else
              # otherwise put the appropriate spaces before the value
              r[c]=s n
            # track the maximum row number across all FEATUREs
            if(c>l)
              l=c
          }
        # shift the row prefix for the next FEATURE
        s=s" "
      }
      # print row by row
      for (k=0;k<=l;k++)
        print r[k]
  }' feature example | column -tn
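One caveat, offered as an observation rather than a correction: `e[n] !~ i` treats the feature as a regular expression matched anywhere in the collected string, so a feature that is a substring of another (say a hypothetical `Lead` next to `Leader`) would be counted as present. It makes no difference for the sample data. If it matters for your data, a whole-word test is one way around it; the sketch below uses a hypothetical helper `has_word` to show the idea with awk's `index` on space-padded strings.

```shell
#!/bin/bash
# Hypothetical helper: succeed if word $2 occurs as a whole word in the
# space-separated list $1 (the same shape as the e[n] strings built above).
has_word() {
  # Pad both sides with spaces so only complete words can match;
  # index() returns 0 when the padded word is not found.
  awk -v list="$1" -v w="$2" 'BEGIN { exit !index(" " list " ", " " w " ") }'
}

has_word " Lecturer Student Leader" "Leader" && echo "Leader: present"
has_word " Lecturer Student Leader" "Lead"   || echo "Lead: not present"
```

Inside the awk program itself, the equivalent condition would be `!index(e[n] " ", " " i " ")`, since each entry in `e[n]` already starts with a space.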
