Need help converting a CSV file format using Unix commands

Input file:

"Run Date/Time: 2022-02-09 12:47",,,GOOD_MORNING_WORLD
"File Processed: AB-FILE2.20220209.110516",,,GOOD_MORNING_WORLD
AB1234,5,"        PQR2",GOOD_MORNING_WORLD
AB-345,10,"        PQR2",GOOD_MORNING_WORLD
XY890,20,"        PQR2",GOOD_MORNING_WORLD

Expected output file:

Codes Produced, Count, PQR, Run Date, Run Time, File Processed
AB1234,5,"PQR2",2022-02-09,12:47,AB-FILE2.20220209.110516
AB-345,10,"PQR2",2022-02-09,12:47,AB-FILE2.20220209.110516
XY890,20,"PQR2",2022-02-09,12:47,AB-FILE2.20220209.110516

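Please help me produce the output format above.

I tried the command below, but it prints the labels together with the values. I need to strip the labels from each row.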

awk -F, 'NR==1 {FN = $1; next} NR==2 {DT = $1; next} {print $1,$2,$3,FN,DT,$4}' OFS=, InputFile.csv > InputFile_Int.csv

AB1234,5,"        PQR2","Run Date/Time: 2022-02-09,12:47",File Processed: AB-FILE2.20220209.110516,GOOD_MORNING_WORLD
AB-345,10,"        PQR2","Run Date/Time: 2022-02-09,12:47",File Processed: AB-FILE2.20220209.110516,GOOD_MORNING_WORLD
XY890,20,"        PQR2","Run Date/Time: 2022-02-09,12:47",File Processed: AB-FILE2.20220209.110516,GOOD_MORNING_WORLD

Thanks in advance.

Answer 1

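Don't put FS= or OFS= at the end of your script. It makes the code harder to read, because people read the script assuming FS and OFS have their default values and only discover at the very end that you changed them. Instead, set both up front, i.e. do awk -Fx -v OFS=y 'script' file or awk 'BEGIN{FS="x";OFS="y"} script' file, not awk -Fx 'script' OFS=y file. The only exception to that rule is when you need different values for different input files, in which case you set one or both of them between the input file names.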
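As a minimal sketch of the three placements described above (the sample data, the ":" and "-" separators, and the temporary file names are illustrative assumptions, not part of the question):

```shell
# Preferred: both separators are visible before the script itself.
up_front=$(printf 'a:b:c\n' | awk -F: -v OFS=- '{ $1 = $1; print }')

# Equivalent: set them in a BEGIN block at the top of the script.
in_begin=$(printf 'a:b:c\n' | awk 'BEGIN { FS = ":"; OFS = "-" } { $1 = $1; print }')

# The exception: a different FS for each input file, set between the file names.
tmp=$(mktemp -d)
printf 'a:b\n' > "$tmp/colon.txt"
printf 'c,d\n' > "$tmp/comma.txt"
per_file=$(awk '{ print $2 }' FS=: "$tmp/colon.txt" FS=, "$tmp/comma.txt")
rm -r "$tmp"

printf '%s\n%s\n%s\n' "$up_front" "$in_begin" "$per_file"
```

The assignment $1 = $1 is only there to make awk rebuild the record with OFS, so the effect of the output separator is visible.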

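Also, don't use all-uppercase names for user-defined variables. It obfuscates your code by making it look as if you are using built-in variables when you aren't, and it can cause clashes: a variable you think you are defining may actually clobber, or be clobbered by, a built-in variable of the same name.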
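A small sketch of such a clash, using made-up input: the intent is to count the lines containing "ok", but the first command stores the count in NR, which awk itself also increments for every record it reads.

```shell
# Buggy: NR is awk's built-in record counter, so both the script and awk
# itself keep changing it, and the final value is not the intended count.
clobbered=$(printf 'ok\nbad\nok\n' | awk '/ok/ { NR++ } END { print NR }')

# Safe: a lowercase name cannot collide with a built-in variable.
counted=$(printf 'ok\nbad\nok\n' | awk '/ok/ { n++ } END { print n }')

printf 'NR=%s n=%s\n' "$clobbered" "$counted"
```

Only the lowercase version reports the two matching lines; the NR version reports a larger number because the built-in counter and the user's increments are mixed together.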

$ cat tst.awk
BEGIN {
    FS = "[\"[:space:]]*,[\"[:space:]]*"
    OFS = ","
}
NR < 3 {
    split($1,parts," ")
    if ( NR == 1 ) {
        date = parts[3]
        time = parts[4]
    }
    else {
        file = parts[3]
        print "Codes Produced"," Count"," PQR"," Run Date"," Run Time"," File Processed"
    }
    next
}
{ print $1, $2, "\"" $3 "\"", date, time, file }

$ awk -f tst.awk InputFile.csv
Codes Produced, Count, PQR, Run Date, Run Time, File Processed
AB1234,5,"PQR2",2022-02-09,12:47,AB-FILE2.20220209.110516
AB-345,10,"PQR2",2022-02-09,12:47,AB-FILE2.20220209.110516
XY890,20,"PQR2",2022-02-09,12:47,AB-FILE2.20220209.110516
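To see why the FS regular expression in tst.awk removes the quotes and padding around PQR2, here is a minimal sketch that prints each field of one sample row in brackets. It assumes an awk that supports POSIX character classes such as [:space:].

```shell
# FS matches a comma plus any double quotes or whitespace touching it on
# either side, so the quotes and padding around PQR2 become part of the
# separator rather than part of the field.
fields=$(printf '%s\n' 'AB1234,5,"        PQR2",GOOD_MORNING_WORLD' |
    awk 'BEGIN { FS = "[\"[:space:]]*,[\"[:space:]]*" }
         { for (i = 1; i <= NF; i++) printf "[%s]", $i; print "" }')

printf '%s\n' "$fields"
```

Because the quotes are consumed by the separator, the script has to put them back explicitly when printing the third field.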

Answer 2

Try

awk -F"[ ,]*" '
        {gsub(/"/,"")
        }
NR==1   {print "Codes Produced, Count, PQR, Run Date, Run Time, File Processed"
         DT = $3
         TM = $4
         next
        }
NR==2   {FN = $3
         next
        }
        {print $1,$2,"\"" $3 "\"",DT,TM,FN
        }
' OFS=, file
Codes Produced, Count, PQR, Run Date, Run Time, File Processed
AB1234,5,"PQR2",2022-02-09,12:47,AB-FILE2.20220209.110516
AB-345,10,"PQR2",2022-02-09,12:47,AB-FILE2.20220209.110516
XY890,20,"PQR2",2022-02-09,12:47,AB-FILE2.20220209.110516
