Changing the field delimiter and quote character

I want to modify the contents of two different files. How can I get the expected output on Unix with a generic script?

First file: " inside quoted strings, and , (the delimiter) inside quoted strings

Example:

"20181115","12345643","This is a "test"","","657","This is a "TEST"","","aaaa"
"20181115","12345632","This is an "example" of the file, a "sample" aaaa","123","",""TEST"","",""

Expected output:

~20181115~;~12345643~;~This is a "test"~;~~;~657~;~This is a "TEST"~;~~;~aaaa~
~20181115~;~12345632~;~This is an "example" of the file, a "sample" aaaa~;~123~;~~;~"TEST"~;~~;~~

Second file: | (the delimiter) inside quoted strings, and multiple " inside strings

Example:

"098789"|"Hello world!"| 12,7|"Cities I want to visit Rome| London"|15.11.2018|"Yes"
"032425"|"Travel in ""New York"", USA"| 113,3||15.11.2018|"Yes"

Expected output:

~098789~;~Hello world!~; 12,7;~Cities I want to visit Rome| London~;15.11.2018;~Yes~
~032425~;~Travel in /"New York/", USA~; 113,3;;15.11.2018;~Yes~

Answer 1

For the first file, try a simple sed substitution:

sed 's/","/~;~/g; s/^"\|"$/~/g' file
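The \| alternation in the command above is a GNU extension to basic regular expressions, so it may not work with every sed. A minimal sketch of the same substitution without it, anchoring the two outer quotes separately, could look like this. It assumes, as in the question, that every field is quoted and that the sequence "," never occurs inside a field; the function name convert_file1 is made up for illustration.

```shell
# Hypothetical helper: turn "a","b" records into ~a~;~b~ (reads stdin).
convert_file1() {
    # 1) "," between fields -> ~;~   2) leading " -> ~   3) trailing " -> ~
    sed -e 's/","/~;~/g' -e 's/^"/~/' -e 's/"$/~/'
}

# First sample line from the question
printf '%s\n' '"20181115","12345643","This is a "test"","","657","This is a "TEST"","","aaaa"' | convert_file1
```

Quotes inside a field survive because only the exact sequence "," and the quotes at the very start and end of the line are touched.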

And, for the second file, a somewhat more complex awk script:

awk -F\" '{$1=$1; for (i=2; i<=NF; i+=2) gsub ("\|", SUBSEP, $i); gsub ("\|", ";"); gsub ("~~", "/\""); gsub (SUBSEP, "|")} 1' OFS="~" file 

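It first replaces every | inside double quotes with a placeholder (awk's SUBSEP) that is unlikely to occur in text files, then performs the required substitutions, and finally reverses the placeholder substitution.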
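The steps above can be spelled out one per line. The following is only a sketch of the same idea, not a different method: the function name convert_file2 is hypothetical, and the pipe is written as the bracket expression [|] instead of "\|" because the handling of \| inside a string constant is not uniform across awk implementations.

```shell
# Hypothetical helper: convert the pipe-delimited file to ~...~;~...~ (reads stdin).
convert_file2() {
    awk -F'"' -v OFS='~' '
    {
        $1 = $1                          # rebuild the record: every " becomes ~
        for (i = 2; i <= NF; i += 2)     # even-numbered fields were inside quotes
            gsub(/[|]/, SUBSEP, $i)      # hide their | behind the SUBSEP placeholder
        gsub(/[|]/, ";")                 # the remaining | are real delimiters
        gsub(/~~/, "/\"")                # a doubled quote ("" -> ~~) becomes /"
        gsub(SUBSEP, "|")                # restore the hidden |
        print
    }'
}

# Second sample line from the question
printf '%s\n' '"032425"|"Travel in ""New York"", USA"| 113,3||15.11.2018|"Yes"' | convert_file2
```

Splitting on " is what makes the parity trick work: text between an opening and a closing quote always lands in an even-numbered field, so only those fields need their | protected.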

Note that both commands are tailored to your question, so they will generally not work for other problems, even similar ones, without adjustment.
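As one illustration of that caveat, here is a hedged sketch of what happens when the pipe-delimited format contains an empty quoted field, a case the question's samples do not include. The input line is invented for the demonstration, and the awk program is the same pipeline with the pipe written as [|].

```shell
# Invented input: the middle field is an empty quoted string.
out=$(printf '%s\n' '"a"|""|"b"' |
awk -F'"' -v OFS='~' '{
    $1 = $1
    for (i = 2; i <= NF; i += 2) gsub(/[|]/, SUBSEP, $i)
    gsub(/[|]/, ";"); gsub(/~~/, "/\""); gsub(SUBSEP, "|")
} 1')

# The empty field "" also turns into ~~, so it is rewritten as an escaped quote.
printf '%s\n' "$out"    # prints ~a~;/";~b~ rather than ~a~;~~;~b~
```

Handling such input would need a way to tell an empty field apart from a doubled quote, which this approach does not attempt.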

Applied to the second sample from the question, the output is (Ubuntu, mawk 1.3.3 Nov 1996):

~098789~;~Hello world!~; 12,7;~Cities I want to visit Rome| London~;15.11.2018;~Yes~
~032425~;~Travel in /"New York/", USA~; 113,3;;15.11.2018;~Yes~
