Transpose/rotate a CSV text file

I need to transpose/rotate a CSV file. I don't know whether this is possible.

Suppose the CSV file contains:

filename;rating;id;summary
S4348gjO.doc;good;0001;describing how to reach your goals
S4348gjO.doc;good;0002;some recipes for avoiding an argument
S4348gjO.doc;bad;0003;boring part of the page
A234HK.doc;fairly good;0001;how to deploy a server
A234HK.doc;bad;0002;start and stop the server

The output must be:

filename;good;fairly good;bad;id
S4348gjO.doc;describing how to reach your goals; ; ;0001
S4348gjO.doc;some recipes for avoiding an argument; ; ;0002
S4348gjO.doc; ; ;boring part of the page;0003
A234HK.doc; ;how to deploy a server; ;0001
A234HK.doc; ; ;start and stop the server;0002

Answer 1

It looks like you want something like this:

awk 'BEGIN{FS=OFS=";"}
FNR==1{print "filename;good;fairly good;bad;id"}
$2=="good"{print $1, $4, " ", " ", $3}
$2=="fairly good"{print $1, " ", $4, " ", $3}
$2=="bad"{print $1, " ", " ", $4, $3}' infile

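So the column matching the rating gets the summary, and the other two get a single space (as in your example; if you need an empty field, replace " " with "").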
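
If you expect to switch between a space and an empty field, the filler can be set in one place. This is only a sketch of the same script: the function name pivot_fill and the awk variable fill are names made up here, and the function reads the CSV on standard input. A next is added to the header rule so the old header line is not tested against the rating rules.

```shell
# pivot_fill and fill are hypothetical names chosen for this sketch.
# First argument: the filler for the unused rating columns (default: empty).
pivot_fill() {
    awk -v fill="${1-}" 'BEGIN{FS=OFS=";"}
    FNR==1{print "filename;good;fairly good;bad;id"; next}   # replace the header
    $2=="good"{print $1, $4, fill, fill, $3}
    $2=="fairly good"{print $1, fill, $4, fill, $3}
    $2=="bad"{print $1, fill, fill, $4, $3}'
}
```

Called as pivot_fill ' ' < infile it should give the spaces from the question; pivot_fill '' < infile gives truly empty fields.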

Answer 2

A slightly modified version of don_crissti's script:

awk -F\; '
    BEGIN{
        P["good"]="%s;%s;;;%s\n"
        P["fairly good"]="%s;;%s;;%s\n"
        P["bad"]="%s;;;%s;%s\n"
        }                         
    FNR==1{
        print "filename;good;fairly good;bad;id"
        next
        }
    {
        printf(P[$2],$1,$4,$3)
        }
    ' infile
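
Both scripts so far hard-code the three ratings. If the list might change, one possible variation is to pass the rating columns as a single ;-separated string and build each line in a loop. This is a sketch under assumptions: pivot_cols and the awk variable cols are names invented here, the input comes from standard input, and a row whose rating is not in the list ends up with all rating columns empty.

```shell
# pivot_cols and cols are hypothetical names chosen for this sketch.
# First argument: the rating columns, in output order, separated by ";".
pivot_cols() {
    awk -F';' -v OFS=';' -v cols="${1:-good;fairly good;bad}" '
        BEGIN { n = split(cols, col, ";") }      # col[1..n] = rating names
        FNR == 1 {                               # replace the original header
            printf "filename"
            for (i = 1; i <= n; i++) printf "%s%s", OFS, col[i]
            print OFS "id"
            next
        }
        {
            printf "%s", $1
            # the summary goes into the column whose name equals the rating
            for (i = 1; i <= n; i++) printf "%s%s", OFS, ($2 == col[i] ? $4 : "")
            print OFS $3
        }'
}
```

For example, pivot_cols < infile uses the three ratings from the question, and pivot_cols 'bad;good' < infile would keep only those two columns.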

Answer 3

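Use the reshape subcommand of Miller (mlr) to pivot the data, with the rating field as the key field and the summary field as the value field. Then use unsparsify to assign empty values to the new fields that are missing from each record, and reorder to put the fields in the order shown in the question: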

$ mlr --csv --fs ';' reshape -s rating,summary then unsparsify then reorder -f filename,good,'fairly good',bad,id file
filename;good;fairly good;bad;id
S4348gjO.doc;describing how to reach your goals;;;0001
S4348gjO.doc;some recipes for avoiding an argument;;;0002
S4348gjO.doc;;;boring part of the page;0003
A234HK.doc;;how to deploy a server;;0001
A234HK.doc;;;start and stop the server;0002

To fill the missing values with a space, as shown in the question:

$ mlr --csv --fs ';' reshape -s rating,summary then unsparsify --fill-with ' ' then reorder -f filename,good,'fairly good',bad,id file
filename;good;fairly good;bad;id
S4348gjO.doc;describing how to reach your goals; ; ;0001
S4348gjO.doc;some recipes for avoiding an argument; ; ;0002
S4348gjO.doc; ; ;boring part of the page;0003
A234HK.doc; ;how to deploy a server; ;0001
A234HK.doc; ; ;start and stop the server;0002
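
If Miller is not installed, the part where the new column names are taken from the data itself can be roughly imitated in awk. This is not a reimplementation of reshape, only a sketch: pivot_auto is a made-up name, the whole input is held in memory, each filename/id pair is assumed to occur once, and the rating columns come out in the order the ratings first appear in the data (for the sample: good, bad, fairly good) rather than in the order the question asks for.

```shell
# pivot_auto is a hypothetical name chosen for this sketch; reads stdin.
pivot_auto() {
    awk -F';' -v OFS=';' '
        FNR == 1 { next }                        # skip the original header
        {
            # remember each rating the first time it is seen
            if (!($2 in seen)) { seen[$2] = 1; col[++n] = $2 }
            name[++r] = $1; rating[r] = $2; id[r] = $3; text[r] = $4
        }
        END {
            printf "filename"
            for (i = 1; i <= n; i++) printf "%s%s", OFS, col[i]
            print OFS "id"
            for (k = 1; k <= r; k++) {
                printf "%s", name[k]
                for (i = 1; i <= n; i++)
                    printf "%s%s", OFS, (rating[k] == col[i] ? text[k] : "")
                print OFS id[k]
            }
        }'
}
```

Run it as pivot_auto < file. Fixing the column order afterwards still needs something like the reorder step above.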
