How can I avoid the "Chi UTF-8 error" when dumping etex from etex.src?


The "Chi error":

! Text line contains an invalid character.
l.18   \testengine ^^?
                      ^^?!\relax % That's Chi, a 2-byte UTF-8 sequence

The error is triggered by this line:

18  \testengine χ!\relax % That's Chi, a 2-byte UTF-8 sequence

in dehypht-x-2019-04-04.tex (Windows 10, TeX Live 2019 distribution).

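Does eTeX need to be taught UTF-8 first before it can process this file? If so, how?

A problem I ran into just before this one was that etex.src sometimes contains tabs instead of spaces.

Is there some other problem with the file dehypht-x-2019-04-04.tex, or is this a problem with my eTeX setup?

I generated eTeX as follows (web2js converts TeX's Pascal to JavaScript):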

 fmt_file = "*etex.src"
 library.setInput("\n" + fmt_file + " \\dump\n\n", 

As you can see in the following log, the "Chi" problem starts in the file dehypht-x-2019-04-04.tex:

This is e-TeX, Version 3.14159265-2.6 (INITEX)
**entering extended mode
(etex.src (plain.tex Preloading the plain format: codes, registers,
parameters, fonts, more fonts, macros, math definitions, output routines,
hyphenation (hyphen.tex [skipping from \patterns to end-of-file...]))
(etexdefs.lib Skipping module "grouptypes"; Loading module "interactionmodes";
Skipping module "nodetypes"; Skipping module "iftypes";) (language.def
(hyphen.tex) (dehypht-x-2019-04-04.tex
! Text line contains an invalid character.
l.18   \testengine ^^?
                      ^^?!\relax % That's Chi, a 2-byte UTF-8 sequence
? ! Text line contains an invalid character.
l.18   \testengine ^^?^^?
                         !\relax % That's Chi, a 2-byte UTF-8 sequence
? Runaway argument?
\relax \ifx \secondarg \empty \message {dehyph-exptl: using a \ETC.
! Paragraph ended before \testengine was complete.
<to be read again>
                   \par
l.132
.
.    
.
\relax \ifx \secondarg \empty \message {UTF-8 Hyphenation patt\ETC.
! File ended while scanning use of \testengine.
<inserted text>
                \par
\addlanguage ...uselanguage {#1}\input #2
                                          \if *#3*\else \input #3 \fi...
l.73 ...ntgreek}{loadhyph-grc.tex}{}{1}{1}

? (ibyhyph.tex Greek hyphenation patterns for Ibycus encoding, v3.0
! TeX capacity exceeded, sorry [pattern memory=8000].
l.615 a)2n1a'gku

Answer 1

This answer is based on ideas, hints, and suggestions from David Carlisle, Joseph Wright, and Marcel Krüger. They solved the problem; I am only writing it up here.

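This is a character encoding problem. tex.web reads one character per byte. However, χ (Chi) consists of 2 bytes, \xcf\x87, which together encode the Unicode code point U+03C7.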
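The byte sequence can be checked outside of TeX. The following is a minimal Python sketch (not part of the original answer) that encodes χ as UTF-8 and lists the individual bytes a byte-oriented reader such as tex.web would see:

```python
# Encode U+03C7 (Greek small letter chi) as UTF-8 and list its bytes.
chi = "\u03c7"
encoded = chi.encode("utf-8")

# Each entry is one byte, i.e. one "character" for a byte-oriented reader.
byte_values = [f"{b:02x}" for b in encoded]

print(len(encoded), byte_values)  # 2 ['cf', '87']
```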

How to teach tex.web to accept UTF-8 input (by passing all 8-bit bytes through unchanged):

tex.sys

%add the bigger codepage:

@x
for i:=0 to @'37 do xchr[i]:=' ';
for i:=@'177 to @'377 do xchr[i]:=' ';
@y
for i:=0 to @'37 do xchr[i]:=chr(i);
for i:=@'177 to @'377 do xchr[i]:=chr(i);
@z

Marcel Krüger gives a more detailed explanation of how to use these 7 lines in a system file here: How to prepare a chg file that merges tex.web, tex.ch, etex.ch and etex.sys into a new etex.web?
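To see why these lines matter, here is a rough Python model of the character tables. It is based on my reading of how tex.web initializes xchr and derives xord from it; the function name build_xord is made up for this sketch and is not part of the original answer. In the unpatched tables, every byte above '176 ends up as invalid_code ('177), which would be consistent with the ^^? shown in the error message in the question. With the change, each byte keeps its own value.

```python
INVALID_CODE = 0o177  # value of invalid_code in tex.web


def build_xord(patched):
    """Return a 256-entry input translation table, indexed by byte value.

    This is a simplified model, not the actual Pascal code.
    """
    # xchr: internal code -> external character (modelled as a byte value).
    xchr = []
    for i in range(256):
        if 0o40 <= i <= 0o176:
            xchr.append(i)  # printable ASCII maps to itself
        else:
            # Unpatched: everything else becomes a space.
            # Patched (the change file above): chr(i), i.e. the byte itself.
            xchr.append(i if patched else ord(" "))

    # xord: external character -> internal code, derived from xchr.
    xord = [INVALID_CODE] * 256
    for i in range(0o200, 0o400):
        xord[xchr[i]] = i
    for i in range(0, 0o177):
        xord[xchr[i]] = i
    return xord


for patched in (False, True):
    xord = build_xord(patched)
    print(patched, [hex(xord[b]) for b in "\u03c7".encode("utf-8")])
# False ['0x7f', '0x7f']
# True ['0xcf', '0x87']
```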

With this modification you can build a system that can handle χ:

chi_test.tex

\def\zz#1#2{[#1][#2]}
\immediate\write20{\zzχ..}
\end

output

This is TeX, Version 3.14159265 (preloaded format=plain 1776.7.4)
**(chi_test.tex
[^^cf][^^87]..
 )
No pages of output.
Transcript written on chi_test.log.
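The transcript shows χ split into its two bytes, each picked up by \zz as a separate argument and printed in TeX's ^^ notation. As a small illustration, the Python sketch below reproduces that notation as I understand the printing rules in tex.web; the helper name tex_visible is hypothetical.

```python
def tex_visible(byte):
    """Render one byte the way plain tex.web shows it in the log (sketch)."""
    if 32 <= byte <= 126:
        return chr(byte)                  # printable ASCII is shown as-is
    if byte < 64:
        return "^^" + chr(byte + 64)      # e.g. tab (9) -> ^^I
    if byte < 128:
        return "^^" + chr(byte - 64)      # e.g. 127 -> ^^?
    return "^^" + format(byte, "02x")     # 8-bit bytes -> two lowercase hex digits


chi_bytes = "\u03c7".encode("utf-8")
print("".join("[" + tex_visible(b) + "]" for b in chi_bytes))  # [^^cf][^^87]
print(tex_visible(0o177))                                      # ^^?
```

The second line of output matches the ^^? in the original error message, where both bytes had been turned into the invalid character code before TeX displayed them.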
