I have two csv files: the current file, New.csv, and the previous version, Old.csv. They look like this:
Old.csv
name,age,lastname,film,song,mother,fadher,col0,col1,col3,col4,col5,col6,col7,a,b,z,t
jay,23,,stgh,tt,,,,,,,,,,,
Ann,32,,,,,,,,,,,,,,,,
Chris,43,titanic,hi,,,,,,,
New.csv
name,age,lastname,film,song,mother,fadher,col0,col1,col3,col4,col5,col6,col7,a,b,z,t
jay,23,,stgh,tt,,,,,,,,,,,
alex,22,,hello,,,,,,,,,,,jed,,,
I want to compare them using Linux commands and get a result like this:
status,name,age,lastname,film,song,mother,fadher,col0,col1,col3,col4,col5,col6,col7,a,b,z,t
Common,jay,23,,stgh,tt,,,,,,,,,,,
New,alex,22,,hello,,,,,,,,,,,jed,,,
Old,Ann,32,,,,,,,,,,,,,,,,
Old,Chris,43,titanic,hi,,,,,,,
Answer 1
Using awk
One way:
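awk 'NR==FNR && NR>1{seen[$0]++; next}   # old.csv data rows: remember each one
NR==1{ print "Status," $0}   # header of old.csv: print it with a Status column
FNR!=1{print ($0 in seen)?"Common," $0:"New," $0;delete seen[$0];}   # new.csv data rows
END{for (x in seen) print "Old," x}' old.csv new.csv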
Output:
Status,name,age,lastname,film,song,mother,fadher,col0,col1,col3,col4,col5,col6,col7,a,b,z,t
Common,jay,23,,stgh,tt,,,,,,,,,,,
New,alex,22,,hello,,,,,,,,,,,jed,,,
Old,Chris,43,titanic,hi,,,,,,,
Old,Ann,32,,,,,,,,,,,,,,,,
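
Note that awk does not define the order in which for (x in seen) visits the array, so the Old rows can come out in any order (above, Chris is printed before Ann). If the order matters, here is a minimal sketch of a variant that keeps the Old rows in the order they had in old.csv. The function name compare_csv and the order array are names I made up for illustration; they are not part of the command above. Like that command, it assumes that rows are compared as whole lines, that old.csv is not empty, and that no field contains an embedded newline. One behavioural difference: a row repeated in new.csv is reported as Common every time, whereas the command above reports the repeats as New.

```shell
# compare_csv OLD NEW  (hypothetical helper name, for illustration only)
compare_csv() {
    awk '
        # First file (old): print its header once, remember data rows in order.
        NR == FNR {
            if (FNR == 1) { print "Status," $0 }
            else if (!($0 in seen)) { seen[$0] = 1; order[++n] = $0 }
            next
        }
        # Second file (new): skip its header, tag each row, mark matched old rows.
        FNR > 1 {
            if ($0 in seen) { print "Common," $0; seen[$0] = 0 }
            else { print "New," $0 }
        }
        # Old rows that were never matched, in their original order.
        END {
            for (i = 1; i <= n; i++)
                if (seen[order[i]]) print "Old," order[i]
        }
    ' "$1" "$2"
}
```

Called as compare_csv old.csv new.csv, this should print the header first, then the Common and New rows in new.csv order, then the Old rows in old.csv order.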