awk: split a date-time column in a CSV into three separate columns

I am trying to use substr to split a date-time column, the fifth one (previous_test), into three separate columns appended at the end of each row.

Input:

id,tester,company,chief,previous_test,test,date,result,cost
6582983b-61d4-4371-912d-bbc76bb8208b,Audrey Feest,Pagac-Gorczany,Claudine Moakson,18/02/2019,Passwords,20/05/2020,none,£11897.96

Expected output:

id,tester,company,chief,previous_test,test,date,result,cost,day,month,year
6582983b-61d4-4371-912d-bbc76bb8208b,Audrey Feest,Pagac-Gorczany,Claudine Moakson,18/02/2019,Passwords,20/05/2020,none,£11897.96,18,02,2019

I tried:

awk -F, -v OFS="," '{s = substr($5, 1, 2)} {g = substr($5, 4, 2)} {l = substr($5, 7, 4)} {print s, g, l}' file.csv

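All I get is the date separated by commas, rather than three additional columns appended to the existing ones.

What I am missing is how to append the output to each row as three separate columns.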

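Answer 1

Your code prints only the substring values meant for the new columns, not the existing columns. Print $0 first to keep them.

You also need to handle the first line (the header) specially.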

awk -F, -v OFS="," 'NR==1 { print $0,"day,month,year"; next }
{ s = substr($5, 1, 2);
  g = substr($5, 4, 2);
  l = substr($5, 7, 4);
  print $0, s, g, l}' file.csv

This prints:

id,tester,company,chief,previous_test,test,date,result,cost,day,month,year
6582983b-61d4-4371-912d-bbc76bb8208b,Audrey Feest,Pagac-Gorczany,Claudine Moakson,18/02/2019,Passwords,20/05/2020,none,£11897.96,18,02,2019

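Explanation:

  • The condition NR==1 is true only for the first record/line.
  • $0 is the whole input record/line.
  • The next command jumps to the next record/line and skips all remaining commands for the current one. This means the other commands are executed for every record/line except the first.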
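
To see the effect of NR==1 and next on more than one data row, here is a minimal sketch. The two-column sample data is hypothetical (it is not the question's file), so the date sits in field 2 rather than field 5.

```shell
# Hypothetical sample: a header plus two data rows, date in field 2.
sample='id,previous_test
a1,18/02/2019
b2,03/11/2021'

result=$(printf '%s\n' "$sample" |
  awk -F, -v OFS="," '
    # Header line: append the new column names, then skip the second rule.
    NR==1 { print $0, "day", "month", "year"; next }
    # Every other line: append day, month and year cut out by position.
    { print $0, substr($2, 1, 2), substr($2, 4, 2), substr($2, 7, 4) }
  ')

printf '%s\n' "$result"
```

Without next, the header line would also fall through to the second rule and be printed a second time with substrings of the text "previous_test" appended.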

Edit: as Olivier Dulac suggested in the comments, splitting the date string can be simplified with the split function:

awk -F, -v OFS="," 'NR==1 { print $0,"day,month,year"; next }
{ split($5,a,"/"); print $0, a[1], a[2], a[3] }' file.csv
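
One practical difference between the two approaches: substr relies on fixed character positions, while split relies on the "/" separator. The sketch below compares them on a date without zero padding. That date is a hypothetical value, not something present in the question's data, where all dates appear to be padded.

```shell
date='1/2/2019'   # hypothetical unpadded date (assumption for illustration)

# Fixed positions: the offsets no longer line up with day, month and year.
by_substr=$(printf '%s\n' "$date" |
  awk '{ print substr($0, 1, 2) "," substr($0, 4, 2) "," substr($0, 7, 4) }')

# Separator-based: split() fills a[1], a[2], a[3] whatever the widths are.
by_split=$(printf '%s\n' "$date" |
  awk '{ split($0, a, "/"); print a[1] "," a[2] "," a[3] }')

printf 'substr: %s\nsplit:  %s\n' "$by_substr" "$by_split"
```

If every date in the file is strictly dd/mm/yyyy, both variants give the same result.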

Answer 2

You don't need to split $5; just change the /s to ,s and print. With GNU awk you can use gensub():

$ awk -F, '{print $0 FS (NR>1 ? gensub("/",FS,"g",$5) : "day,month,year") }' file
id,tester,company,chief,previous_test,test,date,result,cost,day,month,year
6582983b-61d4-4371-912d-bbc76bb8208b,Audrey Feest,Pagac-Gorczany,Claudine Moakson,18/02/2019,Passwords,20/05/2020,none,£11897.96,18,02,2019

With any awk:

$ awk -F, '{d=$5; gsub("/",FS,d); print $0 FS (NR>1 ? d : "day,month,year") }' file
id,tester,company,chief,previous_test,test,date,result,cost,day,month,year
6582983b-61d4-4371-912d-bbc76bb8208b,Audrey Feest,Pagac-Gorczany,Claudine Moakson,18/02/2019,Passwords,20/05/2020,none,£11897.96,18,02,2019
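
The portable version copies $5 into d before calling gsub because gsub modifies its target in place. A small sketch of what would probably go wrong otherwise, using a hypothetical two-field record (date in field 2) and assuming OFS is left at its default:

```shell
line='a1,18/02/2019'   # hypothetical record; the date is field 2 here

# Editing the field directly: awk rebuilds $0 with OFS (a space by default),
# and the original date field is overwritten.
in_place=$(printf '%s\n' "$line" |
  awk -F, '{ gsub("/", FS, $2); print $0 }')

# Editing a copy: $0 stays exactly as it was read.
via_copy=$(printf '%s\n' "$line" |
  awk -F, '{ d = $2; gsub("/", FS, d); print $0 FS d }')

printf '%s\n%s\n' "$in_place" "$via_copy"
```

Setting OFS="," would keep the commas between fields, but the original date column would still be replaced rather than preserved, which is not what the expected output asks for.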

Answer 3

This can be done with the following command:

 awk -F "," 'NR==1{print $0",day,month,year"}NR>1{print $0","substr($5,1,2)","substr($5,4,2)","substr($5,7,4)}' file

Output:

id,tester,company,chief,previous_test,test,date,result,cost,day,month,year
6582983b-61d4-4371-912d-bbc76bb8208b,Audrey Feest,Pagac-Gorczany,Claudine Moakson,18/02/2019,Passwords,20/05/2020,none,£11897.96,18,02,2019
