I have a file named file.csv with multiple rows and columns, as shown below:
API,20042017-01:00,341701,341701,480692,480692
API,20042017-02:00,293058,293058,415459,415459
API,20042017-03:00,272692,272692,388942,388942
API,20042017-04:00,279117,279115,399361,399361
API,20042017-05:00,345947,345945,495306,495306
I want to calculate a percentage by taking the ratio of column 4 to column 3 and multiplying it by 100, so I run the following command:
awk -F, '{ print $1, $2, $3, $4, ($4/$3*100), $5, $6 }' file.csv
This gives me the output I want:
API,20042017-01:00,341701,341701,100,480692,480692
API,20042017-02:00,293058,293058,100,415459,415459
API,20042017-03:00,272692,272692,100,388942,388942
API,20042017-04:00,279117,279115,100,399361,399361
API,20042017-05:00,345947,345945,100,495306,495306
But when column 3 contains a non-numeric value, it gives me an error:
awk: (FILENAME=file.csv FNR=3) fatal: division by zero attempted
and it stops processing the remaining lines.
How can I make it continue?
Answer 1
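You can have awk check that a field is numeric by matching it with ~ /^[0-9]+/, and only do the division on lines that pass. Note that this pattern only requires the field to start with a digit.
Here is a small shell script that demonstrates this: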
[root@tiny ~]# cat test.sh
#!/bin/bash
INPUT="API,20042017-01:00,341701,341701,100,480692,480692
API,20042017-02:00,293058,293058,100,415459,415459
API,20042017-03:00,272692,272692,100,388942,388942
API,20042017-04:00,279117,279115,100,399361,399361
API,20042017-04:00,279117,FRED,100,399361,399361
API,20042017-05:00,345947,345945,100,495306,495306"
echo "$INPUT" | awk -F, '$3 ~ /^[0-9]+/ && $4 ~ /^[0-9]+/ { print $1, $2, $3, $4, ($4/$3*100), $5, $6 }'
[root@tiny ~]# ./test.sh
API 20042017-01:00 341701 341701 100 100 480692
API 20042017-02:00 293058 293058 100 100 415459
API 20042017-03:00 272692 272692 100 100 388942
API 20042017-04:00 279117 279115 99.9993 100 399361
API 20042017-05:00 345947 345945 99.9994 100 495306
[root@tiny ~]#
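A stricter variant of the same idea, as a minimal sketch rather than a definitive fix: the pattern is anchored at both ends so that values like 123abc are rejected, and a zero in column 3 is checked explicitly, since a literal 0 passes the digit test and would still trigger the division by zero. The function name pct, the choice to report skipped lines on stderr, and the sample rows with 0 and FRED are assumptions made for illustration. It also sets OFS so the output stays comma-separated, and it relies on /dev/stderr being available, which is the case on Linux and with gawk.

```shell
#!/bin/sh
# Hypothetical helper: reads CSV on stdin and writes CSV with the
# percentage ($4 / $3 * 100) inserted after column 4.
pct() {
    awk -F, -v OFS=, '
        # Columns 3 and 4 must consist of digits only, and column 3 must be non-zero.
        $3 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ && $3 + 0 != 0 {
            print $1, $2, $3, $4, ($4 / $3 * 100), $5, $6
            next
        }
        # Any other line is reported on stderr and skipped, so processing continues.
        { print "skipping line " NR ": " $0 > "/dev/stderr" }
    '
}

# Sample input: two valid rows, one row with a zero divisor, one with a non-numeric field.
printf '%s\n' \
    'API,20042017-01:00,341701,341701,480692,480692' \
    'API,20042017-03:00,0,272692,388942,388942' \
    'API,20042017-04:00,279117,FRED,399361,399361' \
    'API,20042017-05:00,345947,345945,495306,495306' | pct
```

With this input, the two valid rows are printed with the percentage added, and the other two rows are reported on stderr instead of aborting the run. To read the real file, use pct < file.csv.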