How do I add an apostrophe ( ' ) in front of column values?

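I have a CSV file with multiple columns and 1000 records, and I need to put an apostrophe (') in front of every value in one of the columns (say, the second column), on every line except the first (header) line, preferably with a simple one-liner. How can I achieve this using awk or sed? Note that I may have multiple commas inside values enclosed in double quotes.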

Sample data:

"col1","col2","col3","col4","col5"
"value11","value12","value13","value14","value15"
"value21","value22","value23","value24","value25"
"value31","value32","value33","value34","value35"

Expected output:

"col1","col2","col3","col4","col5"
"value11","'value12","value13","value14","value15"
"value21","'value22","value23","value24","value25"
"value31","'value32","value33","value34","value35"

Answer 1

sed:

sed '2,$s/^\("[^"]*","\)/\1'"'"/ test.in

Using ERE to get rid of some of the escapes:

sed -E '2,$s/^("[^"]*",")/\1'"'"/ test.in

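awk (note that -F, splits on every comma, so this only works when no field before the target column contains a comma):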

awk -F, -v OFS=, 'NR>1{sub(/^"/,"\"'"'"'",$2)}1' test.in

If you don't want to worry about quoting, use an escape code:

awk -F, -v OFS=, 'NR>1{sub(/^"/,"\"\x27",$2)}1' test.in
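
As a quick sanity check of the sed approach, here is a minimal sketch that wraps the same substitution in a small function and feeds it a row with a comma inside its first field. The function name quote_col2 and the sample rows are made up for illustration; it assumes every field is double-quoted and contains no embedded double quotes.

```shell
#!/bin/sh
# Hypothetical helper: prefix column 2 with an apostrophe on every line
# except the header. Reads the named files, or stdin if none are given.
quote_col2() {
    # '"'"' closes the single quotes, adds a literal ', and reopens them,
    # so sed sees the script:  2,$s/^\("[^"]*","\)/\1'/
    sed '2,$s/^\("[^"]*","\)/\1'"'"'/' "$@"
}

# The second data row has a comma inside its first (quoted) field.
printf '%s\n' \
    '"col1","col2","col3"' \
    '"value11","value12","value13"' \
    '"value,21","value22","value23"' | quote_col2
```

Because the pattern matches a whole quoted first field rather than splitting on commas, the comma in "value,21" should not shift the target column.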

Answer 2

Using Perl:

perl -pi -e '
             BEGIN{
                 $column_number = 2; # Change as needed
                 $column_number--;
                 $apostrophe = chr 39;
             }
             next unless $this_is_data++; # Skip the first line
             s@ ^((?:"[^"]+"\s*,){$column_number}) "@$1"$apostrophe@x
           ' your_file

This assumes that your fields do not contain backslash-escaped quotes.

Answer 3

Here's a gawk one:

$ gawk -F'","' -v var="'" -v OFS='","' 'NR>1{$2=var$2;} 1' foo.csv 

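The -v option lets you define variables that are accessible to the gawk script. In this case, var is ' and OFS (the output field separator) is ",", the same as the input field separator (-F). We then check that this is not the first line (NR>1) and prepend the value of var to the second column. Finally, the 1 is just a trick: it evaluates to true, which makes gawk print the line. It is equivalent to adding print; but shorter.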

If you want to run this on a different column, just change $2=var$2; to $N=var$N, where N is the number of the column you are interested in.
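
To avoid editing the script for each column, the column number can be passed in as a variable. This is only a sketch under the same assumptions as the one-liner above (every field quoted, "," never occurring inside a value); the function name quote_col and the variables col and q are invented for the example. Column 1 needs special handling because, with "," as the separator, the first field still carries the opening double quote.

```shell
#!/bin/sh
# Hypothetical helper: prepend an apostrophe to column N (1-based) of a
# fully quoted CSV, leaving the header line untouched.
# usage: quote_col N [file...]
quote_col() {
    col=$1
    shift
    awk -F'","' -v OFS='","' -v col="$col" -v q="'" '
        NR > 1 && NF >= col {
            if (col == 1) sub(/^"/, "\"" q, $1)  # $1 starts with the opening quote
            else          $col = q $col
        }
        1' "$@"
}

printf '%s\n' \
    '"col1","col2","col3"' \
    '"value11","value12","value13"' | quote_col 3
```

The NF >= col guard is there so that short or empty lines are printed unchanged instead of being padded with empty fields.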


You can also do this in Perl (of course, you can do everything in Perl):

$ perl -F'\",\"' -ane '$.>1 && do{$F[1]=chr(39).$F[1]}; 
                       print join("\",\"",@F)' foo.csv

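The -a switch makes perl split input lines the way gawk does, except that it saves the fields in the array @F (Perl arrays start at 0, so the second column is $F[1], the third $F[2], and so on). -F (again, like gawk) sets the input field separator. So we check whether the line number is greater than one ($.>1) and, if it is, prepend the value of chr 39 (a ', thanks @josephR) to the second field. Finally, we use join to connect the elements of the @F array with "," and print the resulting string.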

Answer 4

A simple sed will do:

$ sed 's/","/","\x27/' afile
"col1","'col2","col3","col4","col5"
"value11","'value12","value13","value14","value15"
"value21","'value22","value23","value24","value25"
"value31","'value32","value33","value34","value35"

Details

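We are searching for the first occurrence of "," and replacing it with ","'. However, escaping the apostrophe can be tricky, so just use its equivalent hex escape code, \x27.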

Your question

This can be adapted as follows to restrict the change to only the lines you want.

$ cat <(head -n +1 afile) <(tail -n +2 afile | sed 's/","/","\x27/')
"col1","col2","col3","col4","col5"
"value11","'value12","value13","value14","value15"
"value21","'value22","value23","value24","value25"
"value31","'value32","value33","value34","value35"

Or, if you know your sed tricks 8-), you can skip the first line entirely:

$ sed '2,$s/","/","\x27/' afile
"col1","col2","col3","col4","col5"
"value11","'value12","value13","value14","value15"
"value21","'value22","value23","value24","value25"
"value31","'value32","value33","value34","value35"

This tells sed to take only lines 2 through the last line ($) and run them through the search and replace.
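
One caveat: as far as I know, \x27 in the replacement is a GNU sed extension and may not be understood by every sed. Where it is not available, the apostrophe can be supplied through shell quoting instead. A minimal sketch, with a made-up function name (quote_col2_portable) and made-up sample rows:

```shell
#!/bin/sh
# Hypothetical helper: same substitution as above, without relying on \x27.
# '\'' closes the single-quoted string, adds a literal apostrophe, and
# reopens it, so sed sees the script:  2,$s/","/","'/
quote_col2_portable() {
    sed '2,$s/","/","'\''/' "$@"
}

printf '%s\n' \
    '"col1","col2","col3"' \
    '"value11","value12","value13"' | quote_col2_portable
```

The 2,$ address works the same way here: the header passes through untouched and only the first "," on each later line is rewritten.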
