I have a file, aa.csv, with the following contents:
"ID0054XX","PT. SUMUT","18 JL.BONJOL","SUMATERA UTARA, NORTH","MEDAN","","ID9856","PDSUIDSAXXX","","","","Y"
"ID00037687","PAN INDONESIA, PT.","JALAN JENDERAL, SUDIRMAN, SENAYAN","","INDIA","","ID566543","PINBIDJAXXX","","0601","","Y"
I have a script that assigns each comma-separated value to its own variable, using , as the delimiter.
The relevant part of the script is:
IFS=,
[ ! -f $INPUT ] && { echo "$INPUT file not found"; exit 99; }
while read Key Name Address1 Address2 City State Country SwiftCode Nid Chips Aba IsSwitching
do
echo "-------------------------------------------------------------------"
echo "From Key : $Key"
echo "-------------------------------------------------------------------"
echo "-------------------------------------------------------------------"
echo "From Name : $Name"
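The problem is that it also splits on the commas inside the quoted values, whereas my desired output is for each value to go into its own respective variable.
I tried replacing the comma with IFS=[","]
but that did not work. Any suggestions/help would be much appreciated.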
Answer 1
You are doing a few things wrong here:
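- You are using the shell to parse text.
  While that is possible, it is very inefficient. It is slow, hard to write, hard to read, and very hard to get right. The shell is simply not designed for this sort of thing.
- You are trying to parse a CSV file without a CSV parser.
  CSV is not a simple format. Fields can contain the delimiter, as yours do here. Fields can also span multiple lines. Parsing arbitrary CSV data with simple pattern matching is very complicated and extremely hard to get right.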
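To make the second point concrete, here is a minimal Python sketch comparing a naive split on commas with a real CSV parser. The sample record is made up for illustration (it is not from your file): its second field contains a comma and its third contains a line break.

```python
import csv
import io

# Hypothetical record: field 2 contains a comma, field 3 contains a line break.
raw = '"ID1","PAN INDONESIA, PT.","line one\nline two"\n'

# Naive approach: take the first physical line and split it on every comma.
naive_fields = raw.splitlines()[0].split(",")

# CSV-aware approach: the parser honours the quotes, so the embedded comma
# and the embedded newline both stay inside their fields.
parsed_rows = list(csv.reader(io.StringIO(raw, newline="")))

print(len(naive_fields), naive_fields)
print(len(parsed_rows[0]), parsed_rows[0])
```

The naive split cuts the name in two and loses the second half of the multi-line field, while the parser returns a single record with three intact fields.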
The bad, hacky solution is to do this:
$ sed 's/","/"|"/g' file.csv |
while IFS='|' read -r Key Name Address1 Address2 City \
State Country SwiftCode Nid Chips Aba IsSwitching; do
echo "From Key : $Key"; echo "From Name : $Name";
done
From Key : "ID0054XX"
From Name : "PT. SUMUT"
From Key : "ID00037687"
From Name : "PAN INDONESIA, PT."
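This replaces every "," with "|" and then uses | as the delimiter. Note that the surrounding quotes stay in the values, as the output above shows. Of course, this will break if any of your fields can contain |.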
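A rough Python sketch of why that hack is fragile. The helper hacky_split is a made-up name that only approximates the sed-plus-read pipeline (it does not reproduce every detail of how read handles IFS), and the second input line is a hypothetical record with a pipe inside a field.

```python
def hacky_split(line):
    """Approximate sed 's/","/"|"/g' followed by splitting on '|'."""
    return line.replace('","', '"|"').split("|")

ok_line = '"ID00037687","PAN INDONESIA, PT.","Y"'
bad_line = '"ID1","A|B LTD","Y"'  # hypothetical field containing a pipe

ok_fields = hacky_split(ok_line)
bad_fields = hacky_split(bad_line)

print(len(ok_fields), ok_fields)    # three pieces, as intended
print(len(bad_fields), bad_fields)  # four pieces: the pipe split one field in two
```

Every field after the broken one then ends up in the wrong variable, which is the same failure you started with, just with a different character.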
The good, clean approach is to use a proper scripting language (not the shell) and a CSV parser. For example, in Perl (see note 1):
$ cat file.csv | perl -MText::CSV -le '
$csv = Text::CSV->new({binary=>1});
while ($row = $csv->getline(STDIN)){ my ($Key, $Name, $Address1, $Address2, $City, $State, $Country, $SwiftCode, $Nid, $Chips, $Aba, $IsSwitching) = @$row;
print "From Key: $Key\nFrom Name: $Name";}'
From Key: ID0054XX
From Name: PT. SUMUT
From Key: ID00037687
From Name: PAN INDONESIA, PT.
Or, as a script:
#!/usr/bin/perl -l
use strict;
use warnings;
use Text::CSV;
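# Three-argument open, and stop with a message if the file cannot be read
open(my $fh, '<', "file.csv") or die "Cannot open file.csv: $!";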
my $csv = Text::CSV->new({binary=>1});
while (my $row = $csv->getline($fh)){
my (
$Key, $Name, $Address1, $Address2, $City,
$State, $Country, $SwiftCode, $Nid, $Chips,
$Aba, $IsSwitching
) = @$row;
print "From Key: $Key\nFrom Name: $Name";
}
Note that you must install the Text::CSV module first (cpanm Text::CSV), and you may need to install cpanm itself (the cpanminus package on most distributions).
Or, in Python 3:
#!/usr/bin/env python3
import csv
with open('file.csv', newline='') as csvfile:
    linereader = csv.reader(csvfile, delimiter=',', quotechar='"')
    for row in linereader:
        print("From Key: %s\nFrom Name: %s" % (row[0], row[1]))
Saving the Python code above as a script and running it on the file prints:
$ foo.py
From Key: ID0054XX
From Name: PT. SUMUT
From Key: ID00037687
From Name: PAN INDONESIA, PT.
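If you would rather refer to the fields by name, as your shell script does, one option is csv.DictReader with an explicit list of field names. This is only a sketch: the names in FIELDS are taken from the variables in your read command, and the two sample rows are inlined through io.StringIO so that the example runs on its own (in practice you would pass the opened file instead).

```python
import csv
import io

# Field names copied from the variables in the question's `read` command.
FIELDS = ["Key", "Name", "Address1", "Address2", "City", "State",
          "Country", "SwiftCode", "Nid", "Chips", "Aba", "IsSwitching"]

# The two sample rows from the question, inlined so the sketch is self-contained.
sample = (
    '"ID0054XX","PT. SUMUT","18 JL.BONJOL","SUMATERA UTARA, NORTH",'
    '"MEDAN","","ID9856","PDSUIDSAXXX","","","","Y"\n'
    '"ID00037687","PAN INDONESIA, PT.","JALAN JENDERAL, SUDIRMAN, SENAYAN",'
    '"","INDIA","","ID566543","PINBIDJAXXX","","0601","","Y"\n'
)

# Passing fieldnames means the first row is treated as data, not as a header.
records = list(csv.DictReader(io.StringIO(sample, newline=""), fieldnames=FIELDS))

for rec in records:
    print("From Key : %s" % rec["Key"])
    print("From Name : %s" % rec["Name"])
    print("From Address2 : %s" % rec["Address2"])
```

Each record is then a dictionary keyed by field name, so a value such as "SUMATERA UTARA, NORTH" stays in Address2 instead of spilling into City.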
Note 1: Yes, I know this is a UUoC (useless use of cat), but it is simpler to write the one-liner this way.