Changing straight quotes to smart quotes without sacrificing line wrapping


How can I change straight quotes to curly smart quotes without losing the existing line wrapping?

The example contains both single and double straight quotes.

Input

1 I have bowed before only one sanyasi in my life, and that is 'Sri
2 Chandrasekharendra Saraswathi', known to the world as the "Parmacharya."
3 
4 Therefore, I was the ''modern 
5 Indian'',believer in science, and
6 with little concern for spiritual
7 diversions.

Output

1 I have bowed before only one sanyasi in my life, and that is ‘Sri
2 Chandrasekharendra Saraswathi’, known to the world as the “Parmacharya.”
3 
4 Therefore, I was the “modern 
5 Indian”,believer in science, and with
6 little concern for spiritual 
7 diversions.

Answer 1

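To keep line breaks from being a problem, we can run the substitutions with a whole paragraph or the whole file treated as a single string. With Perl, -0777 reads the entire file at once, and -00 switches to paragraph mode (sections separated by blank lines; this of course requires that the line numbers are not part of the input file):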

$ perl -0777 -pe 's/\x27\x27/"/g; s/\x27(.*?)\x27/‘$1’/gs; s/"(.*?)"/“$1”/gs; ' input
I have bowed before only one sanyasi in my life, and that is ‘Sri
Chandrasekharendra Saraswathi’, known to the world as the “Parmacharya.”

Therefore, I was the “modern 
Indian”, believer in science, and
with little concern for spiritual
diversions.

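I use \x27, the hexadecimal representation of the single quote, to make the shell quoting easier. .*? matches any string, but the shortest one possible. The first rule changes doubled single quotes '' into a double quote.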

Alternatively, and similarly, with GNU sed: -z treats the input as NUL-separated strings, so an ordinary text file is read in one go:

$ sed -zEe 's/\x27\x27/"/g; s/\x27([^\x27]*)\x27/‘\1’/g; s/"([^"]*)"/“\1”/g; ' input
I have bowed before only one sanyasi in my life, and that is ‘Sri
Chandrasekharendra Saraswathi’, known to the world as the “Parmacharya.”

Therefore, I was the “modern 
Indian”, believer in science, and
with little concern for spiritual
diversions.
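To see why reading the whole file matters, here is a small comparison, assuming GNU sed (for -z). The file quote.txt and the two function names are hypothetical and exist only for this illustration.

```shell
# quote.txt is a hypothetical sample with one quotation split across two lines.
printf '%s\n' 'the "modern' 'Indian", believer' > quote.txt

# Default mode: sed sees one line at a time, so neither line holds a full pair
# of quotes and the text comes out unchanged.
per_line() {
  sed -E 's/"([^"]*)"/“\1”/g' "$1"
}

# With -z the file is a single record, so the pair is matched across the newline.
whole_file() {
  sed -zE 's/"([^"]*)"/“\1”/g' "$1"
}

per_line quote.txt
whole_file quote.txt
```

The line breaks themselves are left alone in both cases; only the quote characters change.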

Answer 2

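I found a simple solution. It uses pandoc as follows, where the -S option changes straight quotes to curly quotes (note that -S was removed in pandoc 2.0; newer versions control this through the smart extension, for example -t markdown-smart):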

pandoc --wrap=preserve -f markdown -t markdown -S <filename>

Taken from a comment by Ramaprakasha.

Answer 3

A simple implementation that relies entirely on Perl's word character class. It only changes ["] to [„] or [”].

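#!/usr/bin/perl
use strict;
use warnings;

# Undefine the record separator so the whole input file is read as one string.
local $/ = undef;

open my $in, '<', $ARGV[0] or die "I can't read the file. $!";
my $string = <$in>;
close $in;

# A straight quote right after a word character becomes a closing quote;
# one right before a word character becomes an opening quote.
$string =~ s/(\w)"/$1”/g;
$string =~ s/"(\w)/„$1/g;

open my $out, '>', $ARGV[1] or die "I can't write to the file. $!";
print $out $string;
close $out;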

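Save it as script.pl and run perl script.pl INFILE OUTFILE. Afterwards you only need to search for any remaining misplaced straight quotes, such as |aaaaaa"bbbbb| or |aaaa " bbbb|. Hopefully these are not frequent.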
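One way to do that final search is a plain grep for the straight double quote. A minimal sketch follows; leftover.txt is a hypothetical stand-in for the script's output file.

```shell
# leftover.txt is a hypothetical sample standing in for the script's OUTFILE.
printf '%s\n' '„already converted”' 'aaaa " bbbb' 'no quotes here' > leftover.txt

# -n prefixes each hit with its line number; the pattern is a literal
# straight double quote, so curly quotes are not reported.
grep -n '"' leftover.txt
```

Each reported line can then be fixed by hand.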
