How to preserve actually typed newline/end-of-line characters (hex 0x0a) in a macro definition

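I am using the filecontents package and want to wrap it in a macro so that I can factor out some complex processing. The problem is that \begin{filecontents} must be followed by nothing but spaces and an actual trailing newline, and unfortunately the way TeX/LaTeX processes macro definitions swallows that newline. Here is an M(not)WE: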


% next-line-compulsorily-ends-only-with-spaces-then-newline



Equivalent to:



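Because the newline that should end the \begin{filecontents} line is swallowed inside \newcommand{\cf} and therefore no longer exists, TeX/LaTeX does not understand this command and processes the rest as text to be typeset, which results in the error Missing \begin{document}.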
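
The loss of the newline can be made visible with a small test. The following is a minimal sketch, not part of the original example; the macro name \demo is hypothetical. It relies on the fact that TeX turns each end of line into an ordinary space token while it reads the definition text.

```latex
\documentclass{article}
% \demo is a hypothetical name used only for this illustration.
% While the definition text is read, the end of the first line is
% tokenized as a plain space token, so no newline character survives.
\newcommand\demo{first line
second line}
\begin{document}
\ttfamily\meaning\demo
% typesets: \long macro:->first line second line
\end{document}
```

The typeset meaning shows a single space where the line break was typed, which is why anything that needs a real end of line cannot find one inside the macro.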






% The integer-parameter \newlinechar usually has the value 10.
% This means if at the time of writing to text-file (La)TeX encounters a
% character whose code-point's number in (La)TeX's internal character-encoding
% (which either is ASCII or is UTF-8)  is 10, it will not write that character
% but will write a linebreak. 10 denotes the Line Feed-character and using
% (La)TeX's ^^-notation you can also write it as ^^J.
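
As a minimal sketch of the mechanism these comments describe, a line break can be produced at writing time by placing ^^J in the text that is written. The stream name \demowrite and the file name demo-out.txt are assumptions made for this illustration only.

```latex
\documentclass{article}
\newwrite\demowrite % hypothetical output stream
\begin{document}
\newlinechar=10 % LaTeX's usual setting, made explicit here
\immediate\openout\demowrite=demo-out.txt\relax
% ^^J denotes character 10; because it equals \newlinechar,
% it is written as a line break rather than as a character.
\immediate\write\demowrite{first line^^Jsecond line}%
\immediate\closeout\demowrite
The file \texttt{demo-out.txt} now holds two lines.
\end{document}
```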


\cf{suffix1}{ somefirststuff\LaTeX}%

factored-complex-processing-of{ somenextstuff\LaTeX}


\noindent prefix-suffix1.tex is:



\noindent prefix-suffix2.tex is:



Do you see the subtle difference?

In prefix-suffix1.tex there is a space after the phrase \verb|\LaTeX|.

In prefix-suffix2.tex there is no space after the phrase \verb|\LaTeX|.

The reason is:

prefix-suffix1.tex comes from the \verb|\cf|-command.

The tokens forming the second argument of \verb|\cf| were
tokenized under the normal catcode r\'egime. Thus the phrase \verb|\LaTeX| was
tokenized as a control-word token. When \verb|\scantokens| carried out its
simulated unexpanded writing, that control-word token was written unexpanded
with a trailing space, because \LaTeX{} always writes control-word tokens with
a trailing space.\\
Besides this, hashes of catcode 6 are also doubled at writing time.

If you don't like such side effects, you need to have \LaTeX{} read the
second argument of \verb|\cf| under a verbatim catcode r\'egime. How to do
that is shown in the other example.


\noindent\verb|\input{prefix-suffix1.tex}| yields:



\noindent\verb|\input{prefix-suffix2.tex}| yields:




Note that \scantokens{<stuff>}, which is provided by the e-TeX extensions, is similar to

\immediate\openout\myscratchwrite temp.tex\relax
\input temp.tex %\@@@input with LaTeX 2e.



\expandafter\def\expandafter\temp\scantokens{{definition text}}


\immediate\openout\myscratchwrite temp.tex\relax
\myscratchtoks{{definition text}}%
\expandafter\def\expandafter\temp\input temp.tex %\@@@input with LaTeX 2e.
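
For comparison, here is a minimal sketch of a commonly used \scantokens idiom that defines a macro from retokenized text. It is not the code from this answer: the combination of \everyeof{\noexpand} and \endlinechar=-1 is a separate, widely used technique, and the macro name \temp is reused only for illustration.

```latex
\documentclass{article}
\begin{document}
\begingroup
% Do not append an end-of-line character to the pseudo-file.
\endlinechar=-1
% Let \edef survive reaching the end of the pseudo-file.
\everyeof{\noexpand}%
% \scantokens re-reads its argument as if it came from a file.
\edef\temp{\scantokens{definition text}}%
\texttt{\meaning\temp}%
\endgroup
% typesets: macro:->definition text
\end{document}
```

No temporary file is created here; the retokenization happens entirely in memory, which is the practical difference from the file-based variant.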


Note that when tokens are written unexpanded, character tokens of catcode 6 are doubled, i.e., hashes (#) of catcode 6 are doubled.
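
Both side effects of unexpanded writing can be observed in isolation. The following is a minimal sketch under stated assumptions: the stream name \demowrite and the file name demo-unexpanded.txt are hypothetical, and \unexpanded (an e-TeX primitive) stands in for whatever mechanism keeps the tokens from expanding.

```latex
\documentclass{article}
\newwrite\demowrite % hypothetical output stream
\begin{document}
\immediate\openout\demowrite=demo-unexpanded.txt\relax
% A control-word token is written with a trailing space.
\immediate\write\demowrite{\unexpanded{\LaTeX!}}%
% A hash of catcode 6 is written twice.
\immediate\write\demowrite{\unexpanded{#}}%
\immediate\closeout\demowrite
% demo-unexpanded.txt now holds the two lines
%   \LaTeX !
%   ##
\end{document}
```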


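If you don't want these effects, you need to turn \cf into a macro that reads its content under verbatim-catcode conditions. If you do so, \cf must not receive its arguments from other macros; it must obtain them by reading and tokenizing them from the TeX input file. In other words, the same restrictions apply as for the \verb command.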
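
The idea of reading an argument under verbatim catcodes can be sketched in a much reduced form. This is an illustration under assumptions, not the \UDcollectverbarg code: the names \writeverbatimline, \@writeverbatimline, and \demowrite and the file name demo-verbatim.txt are hypothetical, only a brace-delimited single-line argument is handled, and the argument must be typed directly in the input file.

```latex
\documentclass{article}
\newwrite\demowrite % hypothetical output stream
\makeatletter
% \writeverbatimline and \@writeverbatimline are hypothetical names.
\newcommand\writeverbatimline{%
  \begingroup
  \let\do\@makeother % switch all special characters
  \dospecials        % to catcode 12 (other)
  \catcode`\{=1      % keep braces usable for delimiting
  \catcode`\}=2      % the argument
  \@writeverbatimline
}
\newcommand\@writeverbatimline[1]{%
  % All tokens in #1 are plain characters now; \unexpanded merely
  % guards against characters that are active for other reasons.
  \immediate\write\demowrite{\unexpanded{#1}}%
  \endgroup
}
\makeatother
\begin{document}
\immediate\openout\demowrite=demo-verbatim.txt\relax
\writeverbatimline{somestuff#\LaTeX!}
\immediate\closeout\demowrite
% demo-verbatim.txt now holds the line
%   somestuff#\LaTeX!
\end{document}
```

Because the backslash and the hash are read as ordinary characters, no control-word token is formed and no catcode-6 hash exists, so neither the trailing space nor the doubling occurs.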



%%<-------------------- Code for \UDcollectverbarg -------------------->
%% Check whether argument is empty:
%% \UD@CheckWhetherNull{<Argument which is to be checked>}%
%%                     {<Tokens to be delivered in case that argument
%%                       which is to be checked is empty>}%
%%                     {<Tokens to be delivered in case that argument
%%                       which is to be checked is not empty>}%
%% The gist of this macro comes from Robert R. Schneck's \ifempty-macro:
%% <https://groups.google.com/forum/#!original/comp.text.tex/kuOEIQIrElc/lUg37FmhA74J>
  \UD@secondoftwo\string}\expandafter\expandafter\UD@firstoftwo{ }{}%
  \UD@secondoftwo}{\expandafter\expandafter\UD@firstoftwo{ }{}\UD@firstoftwo}%
\catcode`\^^M=12 %
    { #5{#4#2}}{\@UDEndlreplace{#1}#3\relax{#4#2#1}{#5}}%
  \let\do\@makeother % <- this and the next line switch to
  \dospecials        %    verbatim-category-code-régime.
  \catcode`\{=1      % <- give opening curly brace the usual catcode so a 
                     %    curly-brace-balanced argument can be gathered in
                     %    case of the first thing of the verbatimized-argument 
                     %    being a curly opening brace.
  \catcode`\ =10     % <- give space and horizontal tab the usual catcode so \UD@collectverbarg
  \catcode`\^^I=10   %    cannot catch a space or a horizontal tab as its 4th undelimited argument.
                     %    (Its 4th undelimited argument denotes the verbatim-
                     %     syntax-delimiter in case of not gathering a
                     %     curly-brace-nested argument.)
  {% seems a curly-brace-nested argument is to be caught:
    \catcode`\}=2    % <- give closing curly brace the usual catcode also.
  }{% seems an argument with verbatim-syntax-delimiter is to be caught:
    \do\{% <- give opening curly brace the verbatim-catcode again.
  \do\ %   <- Now that \UD@collectverbarg has the delimiter or
  \do\^^I%    emptiness in its 4th arg, give space and horizontal tab
         %    the verbatim-catcode again.
  \do\^^M% <- Give the carriage-return-character the verbatim-catcode.
    \@onelevel@sanitize\@tempb % <- Turn characters into their "12/other"-pendants.
                               %    This may be important with things like the 
                               %    inputenc-package which may make characters 
                               %    active/which give them catcode 13(active).
    \expandafter\UDEndlreplace\expandafter{\@tempb}{#1}{\def\@tempb}% <- this starts 
                               %    the loop for replacing endline-characters.
    \expandafter\UD@@collectverbarg\expandafter{\@tempb}{#2}{#3}% <- this "spits
                               %    out" the result.
%%<---------------- End of code for \UDcollectverbarg ----------------->

% The integer-parameter \newlinechar usually has the value 10.
% This means if at the time of writing to text-file (La)TeX encounters a
% character whose code-point's number in (La)TeX's internal character-encoding
% (which either is ASCII or is UTF-8)  is 10, it will not write that character
% but will write a linebreak. 10 denotes the Line Feed-character and using
% (La)TeX's ^^-notation you can also write it as ^^J.

\cf{suffix1}{ somefirststuff\string#\LaTeX}%

factored-complex-processing-of{ somenextstuff\string#\LaTeX}


\noindent prefix-suffix1.tex is:



\noindent prefix-suffix2.tex is:



\noindent\verb|\input{prefix-suffix1.tex}| yields:



\noindent\verb|\input{prefix-suffix2.tex}| yields:







% The integer-parameter \newlinechar usually has the value 10.
% This means if at the time of writing to text-file (La)TeX encounters a
% character whose code-point's number in (La)TeX's internal character-encoding
% (which either is ASCII or is UTF-8)  is 10, it will not write that character
% but will write a linebreak. 10 denotes the Line Feed-character and using
% (La)TeX's ^^-notation you can also write it as ^^J.


      \romannumeral0\expandafter\exchange\expandafter{\expandafter{\UD@tempa}}{ %

\newcommand\thirdstage{Expanded. \LaTeX}

\cf{suffix1}{ somefirststuff \firststage}%

factored-complex-processing-of{ somenextstuff \firststage}


\noindent prefix-suffix1.tex is:



\noindent prefix-suffix2.tex is:



\noindent\verb|\input{prefix-suffix1.tex}| yields:



\noindent\verb|\input{prefix-suffix2.tex}| yields:



