Replacing quotes in a file

Answer 1

One way, using GNU awk and FPAT:

awk 'BEGIN { FPAT = "([^;]+)|(\"[^\"]+\")" } { for (i=1; i<=NF; i++) if (substr($i,1,1) == "\"" && substr($i,length($i),1) == "\"") { gsub(/"/, "", $i); printf "\"%s\"\n", $i } else { gsub(/"/, "", $i); print $i } }'

Test:

echo '"This";"is";1;"line" of" data";""with";"extra quotes""' | awk 'BEGIN { FPAT = "([^;]+)|(\"[^\"]+\")" } { for (i=1; i<=NF; i++) if (substr($i,1,1) == "\"" && substr($i,length($i),1) == "\"") { gsub(/"/, "", $i); printf "\"%s\"\n", $i } else { gsub(/"/, "", $i); print $i } }'

Results:

"This"
"is"
1
"line of data"
"with"
"extra quotes"

Another way, using GNU awk with FPAT and GNU sed:

sed -e '/^".*"$/ { s/"//g; s/.*/"&"/ }' -e '/^".*"$/!s/"//g'

Test:

echo '"This";"is";1;"line" of" data";""with";"extra quotes""' | awk 'BEGIN { FPAT = "([^;]+)|(\"[^\"]+\")" } { for (i=1; i<=NF; i++) print $i }' | sed -e '/^".*"$/ { s/"//g; s/.*/"&"/ }' -e '/^".*"$/!s/"//g'

Results:

"This"
"is"
1
"line of data"
"with"
"extra quotes"

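Both variants above rely on the same FPAT split, so it may help to look at what FPAT actually hands to the loop before any quotes are touched. The following is only a sketch and not part of the answer: show_fields is a hypothetical helper name, and it assumes GNU awk is installed as gawk, since FPAT is a gawk extension.

```shell
# show_fields: hypothetical helper, not taken from the answer above.
# Prints every field that FPAT yields, one per line, prefixed with its index.
# Assumes GNU awk is available as "gawk"; FPAT is a gawk extension.
show_fields() {
    gawk 'BEGIN { FPAT = "([^;]+)|(\"[^\"]+\")" }
          { for (i = 1; i <= NF; i++) printf "%d: %s\n", i, $i }'
}

# Usage:
#   echo '"is";1;"line" of" data"' | show_fields
```

At this stage each field still carries its stray quotes, which is why the answer strips every quote from a field and then re-wraps it only when the original field began and ended with one.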

Answer 2

I would rather use coreutils and sed (GNU versions):

<<< '"This";"is";1;"line" of" data";""with";"extra quotes""' \
tr ';' '\n' | sed -r 's/(.)"(.)/\1\2/g' | tr '\n' ';'

Output:

"This";"is";1;"line of data";"with";"extra quotes";

This leaves an extra semicolon at the end and drops the trailing newline. To fix both, insert head -c -1 before the second tr and append ; echo:

tr ';' '\n' | sed -r 's/(.)"(.)/\1\2/g' | head -c -1 | tr '\n' ';'; echo

Output:

"This";"is";1;"line of data";"with";"extra quotes"

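The fixed pipeline can be wrapped in a small function so that it reads as one step. This is a sketch under a few assumptions: strip_inner_quotes is a hypothetical name, the input is a single record on stdin, and both sed and head are the GNU versions (head -c -1 is a GNU extension).

```shell
# strip_inner_quotes: hypothetical wrapper around the pipeline above.
# Reads one semicolon-separated record on stdin and writes it back with
# the quotes that have a character on each side removed.
# Assumes GNU sed (-r) and GNU head (negative byte count).
strip_inner_quotes() {
    tr ';' '\n' |                    # one field per line
    sed -r 's/(.)"(.)/\1\2/g' |      # remove a quote that has a character on each side
    head -c -1 |                     # cut the final newline so no extra ';' appears
    tr '\n' ';'                      # join the fields again
    echo                             # restore the trailing newline
}

printf '%s\n' '"This";"is";1;"line" of" data";""with";"extra quotes""' | strip_inner_quotes
```

Since sed matches cannot overlap, two quotes in a row in the middle of a field lose only one quote per pass; the sample record does not contain that case.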

Answer 3

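My own solution uses only sed. It removes every run of quotes that has a non-delimiter character on both sides, which leaves only the quotes next to a delimiter or at either end of the line (the awk command is there only to make the output easier to read):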

echo '"This";"is";1;"line" of" data";""without";"extra quotes""' | sed -E 's/([^;])"+([^;])/\1\2/g' | awk 'BEGIN { FPAT = "([^;]+)|(\"[^\"]+\")"}; {for ( i=1 ; i<=NF ; i++ ) print $i}'
"This"
"is"
1
"line of data"
"without"
"extra quotes"

I think it is faster, because it works on the whole line at once instead of splitting the line into fields.
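Because the substitution never splits a record, the same command can be used as a plain filter over a file with many lines. A minimal sketch, assuming a sed that accepts -E; fix_quotes and the file names in the usage comment are hypothetical.

```shell
# fix_quotes: hypothetical wrapper around the sed command above.
# Filters stdin to stdout and handles any number of records, one per line.
# Assumes a sed that accepts -E for extended regular expressions.
fix_quotes() {
    # A run of quotes with a non-semicolon character on each side is removed;
    # \1 and \2 put the two neighbouring characters back.
    sed -E 's/([^;])"+([^;])/\1\2/g'
}

# Usage with files (placeholder names):
#   fix_quotes < input.csv > output.csv
printf '%s\n' '"a" b";1' '""c";"d e""' | fix_quotes
# prints:
#   "a b";1
#   "c";"d e"
```

Quotes at the very start or end of a line survive because the pattern needs a character on both sides of the run.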
