我有两个以逗号分隔的 csv 文件(section_ne_lookup.csv 和section_fru_out.csv):
-sh-4.1$ cat section_ne_lookup.csv
CIBI_NMS_NE,,,,,,,,,,,,,,,,,,,,,,,
D,a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,r,s,t,u,v,w
X,NE1,1.1.1.1,,833-006355,00,Windows,,,9.4.04,301035,,,,,,,,,,,,,
X,NE2,2.2.2.2,,833-006352,00,XS-A,,,5.1,51.573,,,,,,,,,,,,,
X,80,3.3.3.3,,833-006366,00,XS-F,,,5.2,52.604,,,,,,,,,,,,,
X,83,4.4.4.4,,833-006366,00,XS-F,,,5.3,53.575,,,,,,,,,,,,,
X,85,5.5.5.5,,833-006352,00,XS-A,,,5.3,53.605,,,,,,,,,,,,,
X,87,6.6.6.6,,Dummy_EPLC60,Dummy_EPLR60,XS-A,,,5.3,53.543,,,,,,,,,,,,,
-sh-4.1$ cat section_fru_out.csv
CIBI_NMS_FRU,,,,,,,,,,,,,,,,,,,,,,,
D,a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,r,s,t,u,v,w,x,y
X,UNKNOWN,0,NE2,,,,NE2:klm11,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm12,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm13,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm14,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm15,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm16,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm2,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm3,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm4,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm5,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm6,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm7,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm8,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,NE2,,,,NE2:klm9,,,,UNKNOWN,,,,,,,0,,,,,,
X,abc,0,83,,,,83:klm1,,,,rty,,,,,,,0,,,,,,
X,UNKNOWN,0,80,,,,80:klm2,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,80,,,,80:klm3,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,80,,,,80:klm4,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,80,,,,80:klm5,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,80,,,,80:klm6,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,80,,,,80:klm7,,,,UNKNOWN,,,,,,,0,,,,,,
X,UNKNOWN,0,80,,,,80:klm8,,,,UNKNOWN,,,,,,,0,,,,,,
X,TBD,1,NE2,,,,NE2:klm4.asd1,,,,bnm,,,,,,,0,,,,,,0000
X,TBD,1,80,,,,80:.klm1.asd1,,,,bnm,,,,,,,0,,,,,,3132
X,abc,1,87,,,,87:ab1,,,,,,,,,,,0,,,,,,xyz-C
X,abc,0,87,,,,87:abA,,,,bnm,,,,,,,0,,,,,,xyz-A
我想要做的是:如果section_ne_lookup.csv的第五或第六列以“Dummy_”短语开头,则获取匹配行第二列的值。并检查此获取的值(section_ne_lookup.csv 中匹配行的第二列)是否存在于section_fru_out.csv 的第四列中。如果是,请从section_fru_out.csv 中删除匹配行。如果不是,请将匹配行保留在section_fru_out.csv 中。根据此规则,应删除以下行:
X,abc,1,87,,,,87:ab1,,,,,,,,,,,0,,,,,,xyz-C
X,abc,0,87,,,,87:abA,,,,bnm,,,,,,,0,,,,,,xyz-A
因为,section_ne_lookup.csv 中最后 2 行的第 5 和第 6 列以 Dummy_ 短语开头。因此,我们应该获取匹配行的第二列(在本例中为 87),并在文件的第四列中搜索该值 (87):section_fru_out.csv。由于该文件的最后 2 行与此规则匹配,因此应删除这些行。
我怎样才能做到这一点?
答案1
这应该可以满足您的需要:
awk -F, '{
if(NR==FNR){a[$2]=$5; b[$2]=$6;}
else if(a[$4] !~ /^Dummy/ && b[$4] !~ /^Dummy/){print}
}' section_ne_lookup.csv section_fru_out.csv
if(NR==FNR){a[$2]=$5; b[$2]=$6;}
:FNR
是当前输入文件的行号,NR
是所有输入的行号。当它们相等时,我们正在读取第一个文件。然后,我们创建a
关联b
数组,其键为第二个字段,其值分别为第 5 个和第 6 个字段。else if (a[$4] !~ /^Dummy/ && b[$4] !~ /^Dummy/){print}
section_ne_lookup.csv
:如果这不是第一个文件,并且与start with中的第二个字段关联的行的第四个或第五个字段都不是Dummy
,则打印该行。