Compute and print a summary of cumulative amounts by state and gender from a CSV file

How can I produce something like this?

Purchase Summary Report
State Gender Purchase Amount
TX F 33734.33
CA F 23911.61
TX M 23043.64
FL M 18846.49

I need to sum the total transactions (to two decimal places) for males and females in each state, and sort the result.

Data

customer_id,first_name,last_name,email,gender,purchase_amount,credit_card,transaction_id,transaction_date,street,city,state,zip,phone

Example

$ cat infile
c77ea913-c4a2-4ff6-b615-26028b375fda,wye,nnnnnnnnn,[email protected],F,23911.61,dddddddddddddddd,8a412563-2956-4bf3-9df3-54277a9799c7,yyyy-mm-dd,ss ssss sssssss, ccccccccc,ca,93000,ddd-ddd-2825
c117f14d-1322-45a9-9f9b-63db5105fd6c,melany,nnnnnnnnn,[email protected],F,33100.03,dddddddddddddddd,a59201e8-9d3c-4f2b-b1fb-6914895fa5cd,yyyy-mm-dd,aa aaaaaaaaaaa aaa, ccccccccc,tx,73000,ddd-ddd-2869
3153178c-a3fb-44d3-896f-f39da69fe826,nancy,nnnnnnnnnn,[email protected],F,634.30,dddddddddddddddd,4724a9c3-fc8a-49d3-956a-b0646364d3ef,yyyy-mm-dd,aaa aaaaa aaaaa,cccccccccccc,tx,73000,ddd-ddd-8746
e4eb28e6-f960-49eb-8f5d-361df60f00c7,jerome,nnnnnnnnn,[email protected],M,23043.64,dddddddddddddddd,af393a75-6298-4c82-84e0-a778ec9956a2,yyyy-mm-dd,aa aaa aaaaa aaaaaaa,cccccccccccccccc,tx,73000,ddd-ddd-3204
b5c32ded-e769-4c52-a553-e34b548280d0,damiano,nnnnnnnnn,[email protected],M,18846.49,dddddddddddddddd,ff06abad-6787-4236-856d-74ec49df3d04,yyyy-mm-dd,aaaa aa aaaaaaaa,cccccccccccc,fl,32000,ddd-ddd-5361
$ awk 'BEGIN{ FS=OFS="," }
    { arr_state[toupper($12) toupper($5)]+=$6 }
END {
    print "Transaction Count Report"
    for (id in arr_state) {
        printf "%-10s %d\n", id, arr_state[id]
    }
}' infile

My result is

Transaction Count Report
State Gender Amount
SCM        2322
MOF        9015
TNM        6671
OR0        3470
WI0        3295
MS0        2640
AZ0        7747
NM0        1090
MD0        2943
AK0        2541

Answer 1

$ awk 'BEGIN {FS=",";print "State Gender Purchase Amount"} 
       {assoc_arr[toupper($12)$5]+=$6} 
       END {for (id in assoc_arr) {
           printf "%-8s %-8s %8.2f\n", substr(id,1,2), substr(id,3), assoc_arr[id]
           }
       }' infile

State Gender Purchase Amount
TX       M        23043.64
CA       F        23911.61
TX       F        33734.33
FL       M        18846.49
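  • The associative array assoc_arr can be keyed on any composite index. Here the key is toupper($12)$5, but you can concatenate any fields of the awk record.
  • Use a floating-point format such as %f (here %8.2f) in the printf call so that amounts are printed with their decimals; %d, as used in the question, truncates them to integers.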
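
The output above comes back in whatever order awk walks the array, while the question also asks for sorting. A minimal sketch of one way to add that is shown below. It assumes the desired order is by amount, descending, as in the sample report; the function name purchase_summary is hypothetical, and the rule that skips a header row only applies if the first line starts with customer_id. The key is built with SUBSEP instead of plain concatenation so that it can be split back into state and gender without relying on a two-letter state code.

```shell
# Hypothetical helper: print a per-state, per-gender purchase summary
# for the CSV file given as $1, sorted by amount (largest first).
purchase_summary() {
    echo "Purchase Summary Report"
    echo "State Gender Purchase Amount"
    awk -F, '
        # Skip a header row if the file has one (assumption).
        FNR == 1 && $1 == "customer_id" { next }
        # Accumulate the amount under a (STATE, GENDER) composite key.
        { total[toupper($12) SUBSEP toupper($5)] += $6 }
        END {
            for (key in total) {
                split(key, part, SUBSEP)
                printf "%s %s %.2f\n", part[1], part[2], total[key]
            }
        }' "$1" | LC_ALL=C sort -k3,3nr   # numeric, descending, on column 3
}
```

Called as purchase_summary infile. The headers are printed by the shell rather than by awk so that they stay out of the sort; to order by state and then gender instead, sort -k1,1 -k2,2 could replace the last stage.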