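I'm on a Mac running OS X 10.11.5. When I paste text from a PDF file into Emacs (TeX mode), I get characters such as "∈" and "Ω". I need to convert them to their LaTeX equivalents, "\in" and "\Omega". Any hints?
Answer 1
Drawing on the various suggestions (many thanks!), I implemented this solution, which works fine for me: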
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import re
import sys

# NOTE: literals containing a backslash are made raw with the r'...' prefix
repldict = {'Ω': r'\Omega', '⊆': r'\subseteq', '⊂': r'\subset',
            '⟨': '<', '⟩': '>',
            '∈': r'\in', '×': r'\times', '’': '*apostrofo*',
            'μ': r'\mu', 'λ': r'\lambda', 'φ': r'\phi',
            '→': r'\rightarrow', '·': r'\cdot', '∥': '||',
            '≤': r'\le',
            '∞': r'\infty', 'ε': r'\varepsilon', 'Φ': r'\Phi',
            '\u2212': '-',   # minus sign
            '“': r'``', '”': r'"',
            '\u2014': '-'}   # long dash


def replfunc(match):
    # Look up the replacement for the character that matched
    return repldict[match.group(0)]


def main():
    # One alternation pattern that matches any key of repldict
    regex = re.compile('|'.join(re.escape(x) for x in repldict))
    inFile = sys.argv[1]
    outFile = 'pdf2latexChars' + '.tex'
    print('inFile=' + inFile + '; outFile=' + outFile)
    # Read and write as UTF-8 so the non-ASCII keys match the pasted text
    with open(inFile, 'r', encoding='utf-8') as fin, \
            open(outFile, 'w', encoding='utf-8') as fout:
        for line in fin:
            fout.write(regex.sub(replfunc, line))


if __name__ == '__main__':
    main()
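Since every key in the table is a single character, the same substitution could probably also be done without a regex, using Python 3's built-in str.maketrans and str.translate. The following is only a minimal sketch of that alternative, not the script above: the names TABLE and to_latex are made up for illustration, and only a few entries of the table are shown.

```python
# Sketch: single-character replacement via str.translate (Python 3).
# TABLE and to_latex are illustrative names, not part of the script above.
TABLE = str.maketrans({
    'Ω': r'\Omega',
    '∈': r'\in',
    '∞': r'\infty',
    '→': r'\rightarrow',
})


def to_latex(text):
    """Replace each mapped Unicode character with its LaTeX command."""
    return text.translate(TABLE)


print(to_latex('x ∈ Ω'))
```

This prints x \in \Omega. Characters that are not in the table are left untouched.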
Answer 2
I just did this in PHP. You should be able to do something similar in Python:
$string = "I want the Latex equivalent of Ω, Δ, etc.";
$string = str_replace(
    array("Ω", "Δ"),
    array('$\Omega$', '$\Delta$'),
    $string);
The string variable is now: "I want the Latex equivalent of $\Omega$, $\Delta$, etc."
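A rough Python counterpart of the PHP snippet might look like the sketch below. It assumes Python 3, and replace_all is a hypothetical helper name chosen here to mimic how str_replace walks two parallel arrays; it is not a standard library function.

```python
# Python sketch of the PHP str_replace call above.
# replace_all is a hypothetical helper name, not a built-in.
def replace_all(text, search, replace):
    """Replace search[i] with replace[i], one pair after another."""
    for old, new in zip(search, replace):
        text = text.replace(old, new)
    return text


string = "I want the Latex equivalent of Ω, Δ, etc."
string = replace_all(string, ["Ω", "Δ"], [r"$\Omega$", r"$\Delta$"])
print(string)
```

As in the PHP version, the string ends up as "I want the Latex equivalent of $\Omega$, $\Delta$, etc." The raw-string prefix r"..." keeps the backslashes literal.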