Below are the first 5 lines of my file. I want to replace "10,00,000.0" in the fifth column with "10,000,000.0".
DE000A2200V7,09:30:00,8.5,8.509,"10,00,000.0","10,00,000.0","850,450.0"
DE000A2200V7,11:30:00,8.7,8.709,"20,00,000.0","20,000.0","870,450.0"
DE000A2200V7,13:30:00,8.763,8.883,"30,00,000.0","20,000.0","882,300.0"
DE000A2200V7,15:30:00,8.481,8.501,"10,00,000.0","10,00,000.0","849,100.0"
DE000A2200W5,09:30:00,15.826,15.835,"20,000.0","20,000.0","1,583,050.0"
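Answer 1
Use csvformat from csvkit to temporarily change the CSV delimiter to @ (or any other character that does not occur in the data), then change the relevant string in the 5th field with awk, and finally switch the delimiter back to the original comma: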
csvformat -D '@' data.csv |
awk 'BEGIN { OFS=FS="@" } $5 == "10,00,000.0" { $5 = "10,000,000.0" }; 1' |
csvformat -d '@'
With your data in data.csv, this produces:
DE000A2200V7,09:30:00,8.5,8.509,"10,000,000.0","10,00,000.0","850,450.0"
DE000A2200V7,11:30:00,8.7,8.709,"20,00,000.0","20,000.0","870,450.0"
DE000A2200V7,13:30:00,8.763,8.883,"30,00,000.0","20,000.0","882,300.0"
DE000A2200V7,15:30:00,8.481,8.501,"10,000,000.0","10,00,000.0","849,100.0"
DE000A2200W5,09:30:00,15.826,15.835,"20,000.0","20,000.0","1,583,050.0"
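Before picking a temporary delimiter, it may be worth confirming that the character really is absent from the file. The following is only a minimal sketch and not part of the original answer: the function name delimiter_is_free is made up for illustration, and it relies on nothing beyond grep with the -F and -q options.

```shell
# Hypothetical helper: succeeds (exit status 0) only if the candidate
# delimiter does not occur anywhere in the given file.
delimiter_is_free() {
    # $1 = candidate delimiter, $2 = CSV file
    # grep -F treats the pattern as a fixed string; -q suppresses output.
    ! grep -F -q -- "$1" "$2"
}
```

For example, delimiter_is_free '@' data.csv && echo "@ is safe to use" prints the message only when no @ appears in data.csv.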
Answer 2
You can use the following sed command to do this:
sed -i 's/^\(\([^,]*,\)\{4\}\)"10,00,000\.0"/\1"10,000,000.0"/' data.csv
Answer 3
Are you sure "20,00,000.0" is correct? If not, try
sed 's/,00,/,000,/' file
If all of the wrong numbers should be corrected, add the g flag to the s command.
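To illustrate the difference the g flag makes, here is a small sketch. The function names fix_first_group and fix_all_groups are hypothetical, and both read from standard input. It assumes the sequence ,00, never occurs legitimately in the data (for example, as a field whose whole value is 00).

```shell
# Without g: only the first ",00," on each line is rewritten.
fix_first_group() {
    sed 's/,00,/,000,/'
}

# With g: every ",00," on each line is rewritten.
fix_all_groups() {
    sed 's/,00,/,000,/g'
}
```

On the first sample line, fix_first_group corrects only the fifth field, while fix_all_groups also corrects the "10,00,000.0" in the sixth field.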
Answer 4
With GNU awk:
awk -vFPAT='([^,]*)|("[^"]+")' -vOFS=, '$5 == "\"10,00,000.0\"" \
{ $5="\"10,000,000.0\""}; {print}' file
Test
$ cat file
DE000A2200V7,09:30:00,8.5,8.509,"10,00,000.0","10,00,000.0","850,450.0"
DE000A2200V7,11:30:00,8.7,8.709,"20,00,000.0","20,000.0","870,450.0"
DE000A2200V7,13:30:00,8.763,8.883,"30,00,000.0","20,000.0","882,300.0"
DE000A2200V7,15:30:00,8.481,8.501,"10,00,000.0","10,00,000.0","849,100.0"
DE000A2200W5,09:30:00,15.826,15.835,"20,000.0","20,000.0","1,583,050.0"
$ awk -vFPAT='([^,]*)|("[^"]+")' -vOFS=, '$5 == "\"10,00,000.0\"" { $5="\"10,000,000.0\""}; {print}' file
DE000A2200V7,09:30:00,8.5,8.509,"10,000,000.0","10,00,000.0","850,450.0"
DE000A2200V7,11:30:00,8.7,8.709,"20,00,000.0","20,000.0","870,450.0"
DE000A2200V7,13:30:00,8.763,8.883,"30,00,000.0","20,000.0","882,300.0"
DE000A2200V7,15:30:00,8.481,8.501,"10,000,000.0","10,00,000.0","849,100.0"
DE000A2200W5,09:30:00,15.826,15.835,"20,000.0","20,000.0","1,583,050.0"
Explanation
-vFPAT='([^,]*)|("[^"]+")'
Splits fields on commas while handling fields that may contain embedded commas (see "Defining Fields by Content" in the GNU awk manual).
-vOFS=,
Declares that the output field separator is a comma.
'$5 == "\"10,00,000.0\"" { $5="\"10,000,000.0\""}; {print}'
If the fifth column matches the string "10,00,000.0", replace it with "10,000,000.0"; then print the line.
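To see how FPAT splits a line, the sketch below prints the field count and the fifth field exactly as awk sees them. It assumes GNU awk is installed under the name gawk (FPAT is a gawk extension), and the function name show_fifth_field is hypothetical.

```shell
# Hypothetical helper: for each input line, print the number of fields
# and the raw fifth field as GNU awk sees them with this FPAT.
# Requires GNU awk; other awk implementations ignore FPAT.
show_fifth_field() {
    gawk -v FPAT='([^,]*)|("[^"]+")' '{ print NF ": " $5 }'
}
```

FPAT keeps the surrounding double quotes as part of the field, which is why both the comparison and the replacement string in the command have to include escaped \" characters.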