Is there a script that reads a TeX file and replaces every instance of a \newcommand?

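I was wondering whether there is a script that reads a .tex file and replaces every instance of a non-standard TeX command with whatever it stands for. I am not sure that what I want is clear, so let me give an example:

Suppose the input is: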

\documentclass{amsart} 
\usepackage{amsmath,amssymb}
\newcommand{\N}{\mathbb{N}}
\DeclareMathOperator{\End}{End}

\begin{document}
In this lecture we'll study the ring of Endomorphisms of an Abelian group $A$.
Let's denote this ring by $\End(A)$. Throughout the lecture, $\N$ will denote
the set of natural numbers.
\end{document} 

Then an ideal output of such a script would be:

\documentclass{amsart}
\usepackage{amsmath, amssymb}
\begin{document}
In this lecture we'll study the ring of Endomorphisms of an Abelian group $A$.
Let's denote this ring by $\operatorname{End}(A)$. Throughout the lecture,  
 $\mathbb{N}$ will denote the set of natural numbers.
\end{document}

P.S. I think I have seen something like this before, but I remember neither where nor which keywords would get Google started.


I was going to write this myself. All the answers here are great, but I miscounted 2 as 4. :(

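Answer 1

Info: I was forced by the TeX.sx chat room mafia to post my cute, flawed, horrible, traumatic, post-apocalyptic poor man's implementation of a replacement script. :)

Well, sadly this is not a TeX answer. :) Here is my humble attempt, in a scripting language I am not very good at.

(I'm looking at you, Python!)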

import re
import sys

if len(sys.argv) != 3:
    print('We need two arguments.')
    sys.exit(1)

# One-line declarations only. Raw strings keep the backslashes readable;
# (.*) is greedy, so the body runs up to the last } on the line.
MATH_OPERATOR = re.compile(r'\\DeclareMathOperator\{\\([A-Za-z]+)\}\{(.*)\}')
NEW_COMMAND = re.compile(r'\\newcommand\{\\([A-Za-z]+)\}\{(.*)\}')

with open(sys.argv[1], 'r') as inputHandler:
    lines = inputHandler.readlines()

mathDictionary = {}
commandDictionary = {}

print('Extracting commands...')
for line in lines:
    mathOperator = MATH_OPERATOR.search(line)
    if mathOperator:
        mathDictionary[mathOperator.group(1)] = mathOperator.group(2)
    newCommand = NEW_COMMAND.search(line)
    if newCommand:
        commandDictionary[newCommand.group(1)] = newCommand.group(2)

print('Replacing occurrences...')
with open(sys.argv[2], 'w') as outputHandler:
    for line in lines:
        current = line
        for x in mathDictionary:
            # drop the declaration itself, then expand the uses
            current = re.sub(r'\\DeclareMathOperator\{\\' + x + r'\}\{(.*)\}', '', current)
            # A function as replacement keeps backslashes in the body literal;
            # a plain string such as '\operatorname{..}' is rejected as a bad escape.
            current = re.sub(r'\\' + x + r'(?![A-Za-z])',
                             lambda m, x=x: '\\operatorname{' + mathDictionary[x] + '}', current)
        for x in commandDictionary:
            current = re.sub(r'\\newcommand\{\\' + x + r'\}\{(.*)\}', '', current)
            current = re.sub(r'\\' + x + r'(?![A-Za-z])',
                             lambda m, x=x: commandDictionary[x], current)
        outputHandler.write(current)

print('Done.')

Now we simply call it:

$ python myconverter.py input.tex output.tex
Extracting commands...
Replacing occurrences...
Done.

input.tex

\documentclass{amsart} 
\usepackage{amsmath,amssymb}
\newcommand{\N}{\mathbb{N}}
\DeclareMathOperator{\End}{End}

\begin{document}
In this lecture we'll study the ring of Endomorphisms of an Abelian group $A$.
Let's denote this ring by $\End(A)$. Throughout the lecture, $\N$ will denote
the set of natural numbers.
\end{document} 

output.tex

\documentclass{amsart} 
\usepackage{amsmath,amssymb}



\begin{document}
In this lecture we'll study the ring of Endomorphisms of an Abelian group $A$.
Let's denote this ring by $\operatorname{End}(A)$. Throughout the lecture, $\mathbb{N}$ will denote
the set of natural numbers.
\end{document} 

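Limitations:

  • This is my code, so beware! :)
  • It only works with \DeclareMathOperator{...}{...} and \newcommand{...}{...}.
  • Optional arguments of \newcommand are not supported.
  • Each declaration must fit on a single line.
  • Please use balanced braces. :)

I know regular expressions are not suited to parsing TeX, but they should work for very simple replacements.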
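As a rough illustration of what lifting the single-line and balanced-brace limitations would involve, here is a minimal sketch that reads one {...} group by counting braces instead of using a regex. The helper name read_group is hypothetical and not part of the script above, and the sketch assumes that escaped braces (\{ and \}) should not count and that comments and verbatim material can be ignored.

```python
def read_group(text, start):
    """Return (body, end) for the brace group that opens at text[start].

    `end` is the index just past the matching closing brace.
    Hypothetical helper: it ignores TeX comments and verbatim material.
    """
    if text[start] != '{':
        raise ValueError('expected "{" at position %d' % start)
    depth = 0
    i = start
    while i < len(text):
        ch = text[i]
        if ch == '\\':
            i += 2  # skip the escaped character, e.g. \{ or \}
            continue
        if ch == '{':
            depth += 1
        elif ch == '}':
            depth -= 1
            if depth == 0:
                return text[start + 1:i], i + 1
        i += 1
    raise ValueError('unbalanced braces')


# A definition whose body spans two lines and contains escaped braces.
source = '\\newcommand{\\set}{\\{\\,x \\mid\n  x \\in \\mathbb{N}\\,\\}}'
name, pos = read_group(source, source.index('{'))
body, _ = read_group(source, pos)
print(name)
print(body)
```

Counting braces like this is what lets the body contain nested groups and line breaks, which the greedy (.*) in a per-line regex cannot do reliably.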

Here is a great article about regular expressions. Have fun. :)

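Answer 2

I stumbled upon de-macro, a Python script written for this purpose. It is included in TeX Live.

Limitations: it only affects \newcommand, \renewcommand, \newenvironment and \renewenvironment, and it does not handle starred versions or optional arguments. The following is quoted from Willie Wong's answer to another question and gives more detail on the limitations: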

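Following the suggestions of Torbjørn T. and cfr, I looked more closely at the de-macro package. It works, to an extent. Here are the caveats:

  • Contrary to what the documentation suggests, the version I installed creates the database file <filename> rather than <filename>.db. Yet it apparently tests for <filename>.db as the name of the definition database. So in its current form it recreates the definition database from scratch on every run. For small documents this is not a problem. For larger documents, one should copy (not move!) the database to <filename>.db to benefit from any potential speed-up.

  • There are still some bugs to iron out. Occasionally it mangles the preamble by inserting a spurious } into the code. I have not yet found the cause or a trigger/MWE. The small test cases I tried all worked fine in this respect.

  • Very important: as the documentation says, all definitions to be swapped out must live in a separate package whose name ends in -private.sty. The main .tex file must load that package.

  • Also important: the program handles \newcommand and \renewcommand, but not the starred variant \newcommand* (though I think this could be fixed by slightly modifying a regular expression in the Python code). This is why my first attempt failed. (I always use the starred version, ever since I learned that it is best practice.)

  • Also important: after I removed the stars, the program threw an error. I eventually worked out that this is because I habitually write \newcommand\cmdname{<replacement>} instead of \newcommand{\cmdname}{<replacement>}. The extra pair of braces matters for the parsing!

  • Lastly, and very disappointingly for me, the program cannot handle optional arguments. \newcommand{\cmdname}[2]{blah #1 blah #2} works fine, but \newcommand{\cmdname}[2][nothing]{blah #1 blah #2} throws an exception.

The star and brace issues I can easily fix or work around by rewriting my macro definitions without the stars and with the extra braces. (The definitions, you will recall, are thrown away at the end anyway, which is the point of this exercise.)

The lack of optional argument handling, however, makes the program somewhat less useful to me at present. For now I can work around it by splitting each command with an optional argument into two separate commands. Perhaps, if I have time in the future, I will look into adding support for it once I have figured out the logic of the original Python script.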
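The star and brace rewrite described in the quote could be done mechanically before running de-macro. The following is only a sketch under stated assumptions: normalize_definitions is a hypothetical helper, not part of de-macro, and it handles only \newcommand and \renewcommand. Definitions with an optional (default) argument are normalized in the same way but also reported, since they would still have to be split by hand.

```python
import re

# \newcommand or \renewcommand, an optional star, then the macro name,
# either bare (\cmd) or already braced ({\cmd}).
DEFINITION = re.compile(
    r'\\(?P<kind>(?:re)?newcommand)\*?\s*'
    r'(?:\{\s*(?P<braced>\\[A-Za-z]+)\s*\}|(?P<bare>\\[A-Za-z]+))'
)
# An argument count followed by a default value, e.g. [2][nothing].
OPTIONAL_ARG = re.compile(r'\s*\[\d\]\s*\[')


def normalize_definitions(text):
    """Drop the star and brace the macro name.

    Also returns the names of macros that use an optional argument,
    which this sketch does not try to rewrite.
    """
    needs_manual_split = []

    def rewrite(match):
        name = match.group('braced') or match.group('bare')
        if OPTIONAL_ARG.match(text, match.end()):
            needs_manual_split.append(name)
        # Returned from a function, so backslashes are taken literally.
        return '\\' + match.group('kind') + '{' + name + '}'

    return DEFINITION.sub(rewrite, text), needs_manual_split


sample = (
    '\\newcommand*\\R{\\mathbb{R}}\n'
    '\\renewcommand*{\\vec}[1]{\\mathbf{#1}}\n'
    '\\newcommand{\\pair}[2][x]{(#1, #2)}\n'
)
fixed, manual = normalize_definitions(sample)
print(fixed)
print(manual)
```

Here the first two definitions come out in the \newcommand{\cmdname}{...} form the quote says de-macro expects, and \pair is listed in manual because of its default argument.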

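Answer 3

Here is a perl script that performs the same task. It has the same limitations as Paulo's code, but it works well on your test case. I am sure it can be improved :)

You can use it as follows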

perl replacenewcommands.plx myfile.tex

which prints to the terminal, or

perl replacenewcommands.plx myfile.tex > outputfile.tex

which writes to outputfile.tex

replacenewcommands.plx

#!/usr/bin/perl

use strict;
use warnings;

# for newcommands
my @newcommandmacro=();
my %newcommandcontent=();

# for DeclareMathoperator
my @declaremathoperator=();
my %declaremathoperatorcontent=();

# for use as an index
my $macro;

# loop through the lines in the INPUT file
while(<>)
{
    # check for 
    #   \newcommand...
    # and make sure not to match
    #   %\newcommand
    # which is commented
    if($_ =~ m/\\newcommand\{(.*)\}\{(.*)\}/ and $_ !~ m/^\s*%/)
    {
        push(@newcommandmacro,$1);
        $newcommandcontent{$1}=$2;

        # remove the \newcommand from the preamble
        s/\\newcommand.*//;
    }


    # loop through the newcommands in the 
    # main document
    foreach $macro (@newcommandmacro)
    {
      # make the substitution, making sure to escape the \
      # using \Q and \E for beginning and end respectively;
      # the lookahead stops \N from matching inside e.g. \Nabla
      s/\Q$macro\E(?![A-Za-z])/$newcommandcontent{$macro}/g;
    }

    # check for 
    #   \DeclareMathOperator...
    # and make sure not to match
    #   %\DeclareMathOperator
    # which is commented
    if($_ =~ m/\\DeclareMathOperator\{(.*)\}\{(.*)\}/ and $_ !~ m/^\s*%/)
    {
        push(@declaremathoperator,$1);
        $declaremathoperatorcontent{$1}=$2;

        # remove the \DeclareMathOperator from the preamble
        s/\\DeclareMathOperator.*//;
    }

    # loop through the DeclareMathOperators in the 
    # main document
    foreach $macro (@declaremathoperator)
    {
      # make the substitution, making sure to escape the \
      # using \Q and \E for beginning and end respectively
      s/\Q$macro\E(\(.*\))/\\operatorname{$declaremathoperatorcontent{$macro}}$1/g;
    }
    print $_;
}

On your test case

myfile.tex (original)

\documentclass{amsart} 
\usepackage{amsmath,amssymb}
\newcommand{\N}{\mathbb{N}}
\newcommand{\mycommand}{something else}
\DeclareMathOperator{\End}{End}

\begin{document}
In this lecture we'll study the ring of Endomorphisms of an Abelian group $A$.
Let's $\N$ denote this ring by $\End(A)$. Throughout the lecture, $\N$ will denote
the set of natural numbers. \mycommand

and \mycommand again
\end{document} 

outputfile.tex (new)

\documentclass{amsart} 
\usepackage{amsmath,amssymb}




\begin{document}
In this lecture we'll study the ring of Endomorphisms of an Abelian group $A$.
Let's $\mathbb{N}$ denote this ring by $\operatorname{End}(A)$. Throughout the lecture, $\mathbb{N}$ will denote
the set of natural numbers. something else

and something else again
\end{document} 

Answer 4

I wrote some javascript to expand macros defined by \def, \gdef, \edef, \xdef, \newcommand, \newcommand*, \renewcommand, \renewcommand*, \DeclareMathOperator and \DeclareMathOperator*. You can try it here.

function expandMacros(tex) {
    function nestBrackets(level) {
        level = level || 5;
        // declare c locally; the chained "re = c = ..." made it an implicit global
        var c = "(?:[^\\r\\n\\{\\}]|\\\\[\\{\\}]|\\r?\\n(?!\\r?\\n))*?", re = c;
        while (level--) re = c + "(?:\\{" + re + "\}" + c + ")*?";
        return " *(\\{" + re + "\\}|[^\\{])";
    }    
    function getRegExp(name, macro) {
        var num = macro.num, def = macro.def, re = "";
        while (num--) re += nestBrackets();
        re = "\\" + name + "(?![a-zA-Z\\}])" + re;
        return new RegExp(re, "g");
    }
    function trimString(s) {
        return s.replace(/^ +| +$/g, '').replace(/^\{|\}$/g, "");
    }
    function extractMacros() {
        var cs = "\\\\\\w+", re;
        // \def, \gdef, \edef and \xdef
        // capture the whole parameter text (#1#2...), not just its last #n
        re = new RegExp("\\\\[gex]?def\\*? *(" + cs + ") *((?:#\\d)*)" + nestBrackets(), "g");
        tex = tex.replace(re, function(match){
            var m = arguments;
            var macro = {
                num:  m[2] ? Math.min(m[2].length / 2, 9) : 0,
                def:  trimString(m[3])
            };
            macros[trimString(m[1])] = macro;
            return "";
        });
        // \newcommand, \newcommand*, \renewcommand and \renewcommand*
        re = new RegExp("\\\\(?:re)?newcommand\\*? *(" + cs + "|\\{" + cs + "\}) *(\\[(\\d)\\])?"
                        + nestBrackets(), "g");
        tex = tex.replace(re, function(match){
            var m = arguments;
            var macro = {
                num:  m[3] || 0,
                def:  trimString(m[4])
            };
            macros[trimString(m[1])] = macro;
            return "";
        });
        // \DeclareMathOperator and \DeclareMathOperator* inside amsmath
        re = new RegExp("\\\\DeclareMathOperator(\\*?) *(" + cs + "|\\{" + cs + "\}) *"
                        + nestBrackets(), "g");
        tex = tex.replace(re, function(match){
            var m = arguments;
            var macro = {
                num:  0,
                def:  "\\operatorname" + m[1] + "{" + trimString(m[3]) + "}"
            };
            macros[trimString(m[2])] = macro;
            return "";
        });
    }
    function replaceMacros() {
        var name, m, re, num;
        for (name in macros) {
            m = macros[name];
            re = getRegExp(name, m), num = m.num;
            //console.log(re);
            tex = tex.replace(re, function(match){
                //console.log(arguments);
                var args = [], result = m.def, k;
                for (k = 1; k <= num; k++) {
                    args[k] = trimString(arguments[k]);
                }
                //console.log(args);
                for (k = 1; k <= num; k++) {
                    // a function replacement keeps "$" in the argument literal
                    result = result.replace(new RegExp("#" + k, "g"), function() { return args[k]; });
                }
                return result;
            });
        }
    }
    var macros = {};
    extractMacros();
    replaceMacros();
    return tex;
}

document.getElementById("run").onclick = function() {
    var input = document.getElementById("input"),
        output = document.getElementById("output");
    output.value = expandMacros(input.value);
}
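
For comparison with the Python script in Answer 1, the argument substitution step of the code above (replacing #1, #2, ... in the definition by the arguments found after the macro) can be sketched in Python as follows. This is not a port of the JavaScript: expand_with_args and read_braced are hypothetical names, every argument is assumed to be written in braces, and calls that cannot be parsed are left untouched.

```python
import re


def read_braced(text, pos):
    """Return (content, end) of the {...} group that starts at text[pos]."""
    depth, i = 0, pos
    while i < len(text):
        if text[i] == '\\':
            i += 2  # skip escaped characters such as \{ and \}
            continue
        if text[i] == '{':
            depth += 1
        elif text[i] == '}':
            depth -= 1
            if depth == 0:
                return text[pos + 1:i], i + 1
        i += 1
    raise ValueError('unbalanced braces')


def expand_with_args(text, name, num_args, definition):
    """Expand each \\name that is followed by num_args brace groups."""
    call = re.compile(r'\\' + re.escape(name) + r'(?![A-Za-z])')
    out, pos = [], 0
    while True:
        match = call.search(text, pos)
        if match is None:
            break
        end, args = match.end(), []
        try:
            for _ in range(num_args):
                while end < len(text) and text[end] == ' ':
                    end += 1
                if end >= len(text) or text[end] != '{':
                    raise ValueError('unbraced argument')
                arg, end = read_braced(text, end)
                args.append(arg)
        except ValueError:
            # leave calls we cannot parse as they are
            out.append(text[pos:match.end()])
            pos = match.end()
            continue

        def fill(m):
            index = int(m.group(1)) - 1
            return args[index] if index < len(args) else m.group(0)

        out.append(text[pos:match.start()] + re.sub(r'#([1-9])', fill, definition))
        pos = end
    out.append(text[pos:])
    return ''.join(out)


result = expand_with_args('$\\pair{a}{\\frac{1}{2}}$ and \\pair {x}{y}',
                          'pair', 2, '\\langle #1, #2 \\rangle')
print(result)
```

Passing a function to re.sub, rather than a replacement string, avoids any special treatment of backslashes in the arguments, which is the same concern as the "$" handling in the JavaScript replace call.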
