How do I create a CSV file without unnecessary whitespace?

I am using the xls2csv binary on my Red Hat Linux machine to convert XLS documents to CSV.

For example (from the man page):

 xls2csv -x "1252spreadsheet.xls" -b WINDOWS-1252 -c "ut8csvfile.csv" -a UTF-8

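However, I noticed the following issues, which cause problems in my Bash script:

  1. The CSV output contains unnecessary whitespace (to the left or to the right of a word).

    Example of incorrect CSV syntax: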

     ,"/var/adm/sys ldd/all  /Comm/logs   ","WORD "," WORD"
    

    Example of correct CSV syntax:

     ,"/var/adm/sys ldd/all  /Comm/logs",WORD,WORD
    
  2. Quotes appear in the CSV when they are not needed:

    Example of incorrect CSV syntax:

     ," WORD ",
    

    Example of correct CSV syntax:

     ,WORD,
    

How can I change the output to produce a "clean" CSV file?

I am looking for an awk/sed/perl one-liner, or any other solution that works in a Bash script.

Example CSV file before the fix:

 1,"/var/adm/sys ldd/all  /Comm/logs",34356,"234245 ",24245
 2,"/var/adm/sys ldd/all
 /Comm/debugs.txt"," 45356",435,"  578 58976  "
 3,"   add this line in crontab    :",34356,"234245 ",24245
 4,"1.0348    54 35.5"," 45356","   435","578 "
 4,"1 2 "," 45356 95857 ","   435","578 "
 5,"1 2 "," 45356 95857 ","   "435","578" "
 6,"1.0348    54 35.5"," 45356"," "4"""    ""35","578 "
 7,"1.0348    54 35.5",""45356",""4"""""35,"578 "

Example of the corrected CSV file (after the fix):

 1,"/var/adm/sys ldd/all  /Comm/logs",34356,234245,24245
 2,"/var/adm/sys ldd/all
 /Comm/debugs.txt",45356,435,"578 58976"
 3,"add this line in crontab    :",34356,234245,24245
 4,"1.0348    54 35.5",45356,435,578 
 4,"1 2","45356 95857",435,578
 5,"1 2","45356 95857","435,578" 
 6,"1.0348    54 35.5",45356,"4"""    ""35,578
 7,"1.0348    54 35.5",""45356",""4"""""35,578

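Commas cannot appear inside fields.

Note the explicit newline contained in the field on line 2.

When a field is enclosed in double quotes and contains no spaces (for example ""45356" on line 7), those double quotes must not be removed, because the entire field, including the quotes, is an encoded password.

Answer 1

This Perl code produces output that is almost exactly as expected: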

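use strict;
use warnings;
use Text::CSV;

# binary: accept embedded newlines; allow_loose_quotes + no escape char:
# tolerate stray double quotes inside fields instead of rejecting the row.
my $csv = Text::CSV->new({ binary => 1, eol => $/, allow_loose_quotes => 1, escape_char => undef });

open my $io, "<", $ARGV[0] or die "Cannot open $ARGV[0]: $!";

while (my $row = $csv->getline($io)) {
        # Trim leading and trailing whitespace from every field.
        my @o = map { s/^\s+//; s/\s+$//; $_ } @{$row};
        # Text::CSV re-adds quotes only where the field content requires them.
        $csv->print(*STDOUT, \@o);
}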

The output is:

1,"/var/adm/sys ldd/all  /Comm/logs",34356,234245,24245
2,"/var/adm/sys ldd/all
/Comm/debugs.txt",45356,435,"578 58976"
3,"add this line in crontab    :",34356,234245,24245
4,"1.0348    54 35.5",45356,435,578
4,"1 2","45356 95857",435,578
5,"1 2","45356 95857",""435","578""
6,"1.0348    54 35.5",45356,""4"""    ""35",578
7,"1.0348    54 35.5",""45356",""4"""""35,"578"
