Unable to compile a LaTeX3 regular expression

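I just wrote some code involving a simple regular expression, but it is much slower than a plain string search (at least 100 times slower). To optimize it, I tried to compile the regular expression, but unfortunately this gives a compilation error: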

Braced quantifier '{' may not be followed by '_'

Any idea what I am doing wrong?

MWE:

\documentclass[]{article}
\begin{document}
\ExplSyntaxOn
\cs_generate_variant:Nn \regex_extract_all:nnN { VVN, nVN }
\str_new:N \l__robExt_my_str
\str_set:Nn \l__robExt_my_str {I like \vegetable and \fruit.}

% Without compilation, this is fine
\regex_extract_all:nVN { \\[A-Za-z]+ } \l__robExt_my_str \l__robExt_output_seq
\show\l__robExt_output_seq

% We compile the regex: we get an error
\regex_const:Nn \l__robExt_macro_regex { \\[A-Za-z]+ }
\regex_extract_all:VVN \l__robExt_macro_regex \l__robExt_my_str \l__robExt_output_seq
\show\l__robExt_output_seq

\ExplSyntaxOff
\end{document}

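Answer 1

I found the solution: the variant must be generated from the NnN base function (giving NVN), not from nnN. With the compiled regex it is maybe 25% faster, but still much slower than I would like. If anyone knows why regular expressions are slow, I would be happy to hear it.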

\documentclass[]{article}

\begin{document}

\ExplSyntaxOn
\cs_generate_variant:Nn \regex_extract_all:nnN { VVN, nVN }
\cs_generate_variant:Nn \regex_extract_all:NnN { NVN }

\str_new:N \l__robExt_my_str
\str_set:Nn \l__robExt_my_str {I like \vegetable and \fruit.}

% Without compilation, this is fine

\regex_extract_all:nVN { \\[A-Za-z]+ } \l__robExt_my_str \l__robExt_output_seq
\show\l__robExt_output_seq

% We compile the regex: with the NVN variant this no longer gives an error
\regex_const:Nn \l__robExt_macro_regex { \\[A-Za-z]+ }
\regex_show:N \l__robExt_macro_regex
\regex_extract_all:NVN \l__robExt_macro_regex \l__robExt_my_str \l__robExt_output_seq
\show\l__robExt_output_seq

\ExplSyntaxOff

\end{document}
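
A likely reading of the original error, offered as an interpretation rather than something stated in the post: a compiled regex variable stores internal l3regex code, not a pattern string. A V-type variant built on \regex_extract_all:nnN expands that variable and hands the internal code to the pattern parser, which then complains about the braces it finds. The sketch below contrasts the two argument kinds; all \..._demo_... names are hypothetical and used only for illustration.

```latex
\documentclass{article}
\begin{document}
\ExplSyntaxOn
% Hypothetical names, for illustration only.
\str_new:N \l__demo_text_str
\seq_new:N \l__demo_matches_seq
\str_set:Nn \l__demo_text_str { I~like~\vegetable~and~\fruit. }

% n-type first argument: a pattern, parsed again on every call.
\cs_generate_variant:Nn \regex_extract_all:nnN { nVN }
\regex_extract_all:nVN { \\[A-Za-z]+ } \l__demo_text_str \l__demo_matches_seq
\seq_show:N \l__demo_matches_seq

% N-type first argument: a regex variable that was compiled once.
% The variable is passed as-is (N), only the string is V-expanded.
\regex_const:Nn \c__demo_macro_regex { \\[A-Za-z]+ }
\cs_generate_variant:Nn \regex_extract_all:NnN { NVN }
\regex_extract_all:NVN \c__demo_macro_regex \l__demo_text_str \l__demo_matches_seq
\seq_show:N \l__demo_matches_seq
\ExplSyntaxOff
\end{document}
```

The rule of thumb this suggests: use the N-type functions whenever the first argument is a regex variable, and never V-expand a compiled regex.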

EDIT: I don't know why, but in my tests the regex searches \regex_match:nVTF and \regex_extract_all:NVN seem to be much slower than \str_if_in:NnTF. Here is my benchmark file:

\documentclass{article}

\usepackage{amsmath}
\usepackage{forest}
% grab latest .sty file from https://github.com/leo-colisson/robust-externalize/
\usepackage{robust-externalize}
\robExtConfigure{
  % If you do not want to enable shell-escape, just manually
  % inspect benchmark-robExt-compile-missing-figures.sh
  % and run "bash benchmark-robExt-compile-missing-figures.sh"
  enable fallback to manual mode,
  compile in parallel after=3,
}

\cacheEnvironment{forest}{
  latex,
  add to preamble={
    \usepackage{forest}
  },
  % Uncomment one line at a time to compare efficiency:
  % (ordered by faster -> slower)
  % 2.86s (adding one "if matches" seems to add around 0.01s, try to add dummy matches to test)
  %if matches={mainName}{forward=\mainName},
  % 3.37s (adding one "if matches" seems to add around 1s, if you uncomment the last the compilation will go to 14s)
  %if matches regex={mainName}{forward=\mainName},
  % 5.21s ==> that's the time I'd like to optimize the most.
  auto forward,
  %%% Just to see that matches takes a really different time from matches regex:
  % Adds ~10s!
  % if matches regex={xxxxA}{},
  % if matches regex={xxxxB}{},
  % if matches regex={xxxxC}{},
  % if matches regex={xxxxD}{},
  % if matches regex={xxxxE}{},
  % if matches regex={xxxxF}{},
  % if matches regex={xxxxG}{},
  % if matches regex={xxxxH}{},
  % if matches regex={xxxxI}{},
  % if matches regex={xxxxJ}{},
}


\NewDocumentCommandAutoForward{\mainName}{}{John}

\begin{document}
\foreach \j in {0,...,200}{
  % This is always the same picture, this mostly drastically reduces the first compilation time
  % without significantly changing the next runs, which is what we try to optimize right now
  \begin{forest}
    [\mainName
    [\mainName [\mainName]]
    [\mainName
    [\mainName [\mainName]]
    [\mainName[\mainName]]
    [\mainName[D[a]][NP[\mainName]]]
    ]
    ]
  \end{forest}\\
}

\end{document}
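
To separate the cost of the regex functions from everything else robust-externalize does, the three calls can be timed on their own. The following is only a sketch, under the assumption that the l3benchmark package (from the l3experimental bundle) is installed and that the engine provides the timing primitives it relies on. It was not part of the original post, and the \..._demo_... names are hypothetical.

```latex
\documentclass{article}
\usepackage{l3benchmark} % assumption: installed; provides \benchmark:n
\begin{document}
\ExplSyntaxOn
% Hypothetical names, for illustration only.
\str_new:N \l__demo_text_str
\str_set:Nn \l__demo_text_str { [\mainName [\mainName [\mainName]]] }
\regex_const:Nn \c__demo_name_regex { mainName }

% Harmless if these variants already exist in the installed expl3.
\cs_generate_variant:Nn \regex_match:nnTF { nVTF }
\cs_generate_variant:Nn \regex_match:NnTF { NVTF }

% 1. Plain string search.
\benchmark:n { \str_if_in:NnTF \l__demo_text_str { mainName } { } { } }

% 2. Regex given as a pattern: parsed and compiled on every call.
\benchmark:n { \regex_match:nVTF { mainName } \l__demo_text_str { } { } }

% 3. Regex compiled once, then reused.
\benchmark:n { \regex_match:NVTF \c__demo_name_regex \l__demo_text_str { } { } }
\ExplSyntaxOff
\end{document}
```

Each \benchmark:n call reports its timing in the terminal and log file, so the three figures can be compared directly. If case 3 is still far slower than case 1, the cost is in the matching itself and not in compiling the pattern.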
