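I have an awk script that produces a summary of several columns according to specific rules. I need to implement the same functionality on a Windows server running PowerShell v4.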
TxnStat|Station|Ccy|Fcy|Lcy|Date
NEW|BRANCH|USD|1000|10|20190410
NEW|ATM|GBP|100|25|20190410
NEW|ATM|GBP|50|10|20190410
NEW|BRANCH|GBP|200|47|20190410
NEW|BRANCH|USD|250|20|20190410
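The idea is to group all records that share the same Station (field 2) and Ccy (field 3), and then:
- compute the sum of the Fcy amount for the group (field 4)
- compute the sum of the Lcy amount for the group (field 5)
- compute field 5 / field 4 (i.e. the exchange rate)
- get the average exchange rate (exchange rate / number of records in the group)
The result I want looks like this. I am dropping the header: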
NEW,BRANCH,USD,30.00,1250.00,0.02,0.01,20190410
NEW,ATM,GBP,35.00,150.00,0.23,0.12,20190410
NEW,BRANCH,GBP,47.00,200.00,0.23,0.23,20190410
I use the awk script below to achieve this:
tail -n+2 TEST.CSV | awk 'BEGIN{FS="|";OFS=","}
{fcy_tot[$2$3]+=$4;num_recs[$2$3]+=1;
lcy_tot[$2$3]+=$5;str_1[$2$3]=$1OFS$2OFS$3;
dt_str[$2$3]=$6;}
END {for (i in fcy_tot)
{exch=lcy_tot[i]/fcy_tot[i];
avg_rate=exch/num_recs[i];
printf "%s,%.2f,%.2f,%.2f,%.2f,%s\n",
str_1[i],lcy_tot[i],fcy_tot[i],exch,avg_rate,dt_str[i]}}'
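It is a one-liner, but I have broken it up here for clarity.
What have I done so far?
I googled and found the Group-Object cmdlet, but with it I could only work out how to sum a single field using measure -sum. I need guidance on how to build custom calculations over several specific fields.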
Answer 1
Assuming the input file is input.txt in the format shown above, this script (which can also be written as a one-liner):
Import-Csv .\input.txt -Delim '|' | Group-Object Station,Ccy | ForEach-Object {
    $Fcy,$Lcy = ($_.Group | Measure-Object Fcy,Lcy -Sum).Sum
    $ExchRate = $Lcy / $Fcy
    $AvgExchR = $ExchRate / $_.Count
    [PSCustomObject]@{
        TxnStat = $_.Group[0].TxnStat
        Station = $_.Group[0].Station
        Ccy     = $_.Group[0].Ccy
        Fcy     = $Fcy.ToString('0.00')
        Lcy     = $Lcy.ToString('0.00')
        ExchRate= $ExchRate.ToString('0.00')
        AvgExchR= $AvgExchR.ToString('0.00')
        Date    = $_.Group[0].Date
    }
} | Format-Table -AutoSize # Format-Table just to visualize
yields:
TxnStat Station Ccy Fcy Lcy ExchRate AvgExchR Date
------- ------- --- --- --- -------- -------- ----
NEW BRANCH USD 1250.00 30.00 0.02 0.01 20190410
NEW ATM GBP 150.00 35.00 0.23 0.12 20190410
NEW BRANCH GBP 200.00 47.00 0.24 0.24 20190410
It looks like you swapped Fcy and Lcy in your sample output.
To output as CSV without a header, replace the Format-Table part with
| ConvertTo-Csv -NoTypeInformation | Select-Object -Skip 1 | Set-Content output.csv
But Export-Csv / ConvertTo-Csv double-quote all fields:
"NEW","BRANCH","USD","1250.00","30.00","0.02","0.01","20190410"
"NEW","ATM","GBP","150.00","35.00","0.23","0.12","20190410"
"NEW","BRANCH","GBP","200.00","47.00","0.24","0.24","20190410"
With a little effort this can be solved as well.
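A minimal sketch of one way to drop the quotes, assuming no field ever contains a comma or a double quote of its own: strip the quotes from each line with -replace after ConvertTo-Csv. The $rows objects and the $lines variable are hypothetical stand-ins for the objects produced by the pipeline above.

```powershell
# Hypothetical sample rows standing in for the pipeline output above
$rows = @(
    [PSCustomObject]@{ TxnStat='NEW'; Station='BRANCH'; Ccy='USD'; Fcy='1250.00'; Lcy='30.00'; ExchRate='0.02'; AvgExchR='0.01'; Date='20190410' }
    [PSCustomObject]@{ TxnStat='NEW'; Station='ATM'; Ccy='GBP'; Fcy='150.00'; Lcy='35.00'; ExchRate='0.23'; AvgExchR='0.12'; Date='20190410' }
)

# Convert to CSV, skip the header line, then remove every double quote.
# Assumption: no field contains a comma or a quote, so the quotes are redundant.
$lines = $rows | ConvertTo-Csv -NoTypeInformation |
    Select-Object -Skip 1 |
    ForEach-Object { $_ -replace '"', '' }

$lines
```

Piping $lines to Set-Content output.csv would then write the unquoted file.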
I haven't explicitly checked whether I used any commands that are newer than PSv4.
## Q:\Test\2019\04\10\SU_1423881.ps1
$Output = Import-Csv .\input.txt -Delim '|' | Group-Object Station,Ccy | ForEach-Object {
    $Fcy,$Lcy = ($_.Group | Measure-Object Fcy,Lcy -Sum).Sum
    # Accumulate Fcy*Lcy over the rows of the group (used for AvgWeigh below)
    $Prod = [Double] 0; ($_.Group | ForEach-Object { $Prod += [Double]$_.Fcy * $_.Lcy })
    $ExchRate = $Lcy / $Fcy
    $AvgExchR = $ExchRate / $_.Count
    [PSCustomObject]@{
        TxnStat = $_.Group[0].TxnStat
        Station = $_.Group[0].Station
        Ccy     = $_.Group[0].Ccy
        Fcy     = $Fcy.ToString('0.00')
        Lcy     = $Lcy.ToString('0.00')
        ExchRate= $ExchRate.ToString('0.00')
        AvgExchR= $AvgExchR.ToString('0.00')
        AvgWeigh= ($Prod / $Fcy).ToString('0.00')   # sum(Fcy*Lcy) / sum(Fcy)
        Date    = $_.Group[0].Date
    }
}
$Output | Format-Table -AutoSize
#$Output | ConvertTo-Csv -Not | Select-Object -Skip 1 | Set-Content output.csv
Sample output:
> .\SU_1423881.ps1
TxnStat Station Ccy Fcy Lcy ExchRate AvgExchR AvgWeigh Date
------- ------- --- --- --- -------- -------- -------- ----
NEW BRANCH USD 1250,00 30,00 0,02 0,01 12,00 20190410
NEW ATM GBP 150,00 35,00 0,23 0,12 20,00 20190410
NEW BRANCH GBP 200,00 47,00 0,24 0,24 47,00 20190410
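The decimal commas in this sample output most likely come from the regional settings of the machine it ran on, since ToString('0.00') formats with the current culture. For a comma-separated output file a decimal point is probably wanted regardless of locale. A minimal sketch, assuming the invariant culture is acceptable for the output; the names $inv, $fcyText and $rateText are only illustrative:

```powershell
# Invariant culture always uses '.' as the decimal separator
$inv = [System.Globalization.CultureInfo]::InvariantCulture

# Values taken from the BRANCH/USD group of the sample data
$Fcy      = [double]1250
$ExchRate = 30 / $Fcy          # 0.024

# Passing the culture explicitly makes the result independent of the locale
$fcyText  = $Fcy.ToString('0.00', $inv)
$rateText = $ExchRate.ToString('0.00', $inv)

"$fcyText,$rateText"
```

The same second argument can be passed to each ToString('0.00') call in the script above.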