How do I build an e-TeX WebAssembly with web2js, Jim Fowler's WEB/TeX Pascal-to-WASM compiler?

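I installed the TeX Live 2019 distribution on Windows 10 and want to run an e-TeX based, preloaded LaTeX (including calculator, calculus, TikZ and CircuiTikZ) under WebAssembly in a web browser.

For this I found the working TikZJax, which works as follows (quoted from the readme.md of Jim Fowler's kisonecat/tikzjax):

How does this work?

Using https://github.com/kisonecat/web2js the Pascal source of tex is compiled to WebAssembly; the latex format is loaded (without all the hyphenation data), and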

\documentclass[margin=0pt]{standalone}
\def\pgfsysdriver{pgfsys-ximera.def}
\usepackage{tikz}

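is executed. Then the core is dumped; the resulting core is compressed, and by reloading the dumped core in the browser it is possible to get very quickly to a point where TikZ can be executed. By using an SVG driver for PGF along with https://github.com/kisonecat/dvi2html the DVI output is converted to SVG.

All of this happens in the browser.

Following the web2js instructions, I carried out these steps:

  1. Download a clean copy of the TeX WEB source; output: tex.web
  2. Generate the Pascal source by tangling, but with the following change file: tangle -underline tex.web etex.ch (thanks to ShreevatsaR for the hint); output: tex.p and tex.pool, after renaming from etex to tex
  3. Compile tex.p to obtain the WebAssembly binary; output: out.wasm
  4. Generate plain.fmt and the corresponding memory dump with the JavaScript file initex.js; input: out.wasm, plain.tex; output: core.dump, plain.fmt, plain.log, texput.log
  5. Compile sample.tex; input: core.dump; output: sample.dvi, sample.log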

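I do not know how to correctly include all the changes from etex.ch so that eTeX is built in Pascal (and runs in WebAssembly).

I cannot compile tex.p (actually etex.p) with web2js to obtain the WebAssembly binary out.wasm.

I understand that etex.ch is missing some changes, for example for memory management.

This is the error from the compile attempt: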

c:\texlive\eTeX\web2js\node_modules\binaryen\index.js:7
if(t){v=__dirname+"/";var ba,ca;a.read=function(c,e){var g=w(c);
g||(ba||(ba=require("fs")),ca||(ca=require("path")),
c=ca.normalize(c),g=ba.readFileSync(c)); return e?g:g.toString()};
a.readBinary=function(c){c=a.read(c,!0);c.buffer||
(c=new Uint8Array(c));assert(c.buffer);return c};1<process.argv.length&&
(a.thisProgram=process.argv[1].replace(/\\/g,"/"));
a.arguments=process.argv.slice(2);
process.on("uncaughtException",function(c){if(!(c instanceof x))
throw c;});process.on("unhandledRejection",y);a.quit=
                                                                                                                                                                                                                                                                                                                                                                                                                                                                    ^
Need 32906 of memory

To avoid this error, I changed the following constants in tex.web before tangling:

  • max_strings 3000 -> 500000
  • string_vacancies 8000 -> 90000
  • pool_size 32000 -> 6250000
  • max_halfword 65535 -> 268435455
  • mem_max 30000 -> 268435455
  • buf_size 500 -> 200000
  • stack_size 200 -> 5000
  • mem_top 30000 -> 268435455

Without these changes I get this error:

! You have to increase POOLSIZE.


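How do I get etex.ch right?


Update 04.08.2019


Thanks to Marcel Krüger's etex.sys, I can now build a simple e-TeX without problems.

Notes on this very valuable answer:
- WebAssembly memory page size: 64 KiB, i.e. 65536 bytes [1]
- WebAssembly memory implementation limit: 2 GB (as of today) => 32767 pages [2]


[1] Allow providing more initial memory than the module specifies #540
[2] Cannot set TOTAL_MEMORY to more than 2 GB, or grow memory beyond 2 GB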

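Answer 1

Increasing the pool size leads to additional memory requirements. So you do not need any other changes for eTeX; you only need to increase the memory provided. In your JavaScript version, the amount of memory is set in the "compiler". With your settings you need 32906 memory pages, but the implementation limit is 32767 pages. Fortunately, you can avoid this problem by using smaller values.

So we need to change some constants in etex.web. This does not mean that your etex.ch is "wrong" and that you need a "correct" one. In fact, the license of etex.ch would prohibit such modifications (at least without changing the name). Instead, you should write a system-dependent etex.sys file, which is passed to tangle later.

So first get copies of tex.web and etex.ch, then run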

tie -m etex.web tex.web etex.ch

to get etex.web. Now you need a change file containing the new constants; for example, save the following as etex.sys:

eTeX compatible constants for web2js

@x
@<Constants...@>=
@!mem_max=30000; {greatest index in \TeX's internal |mem| array;
  must be strictly less than |max_halfword|;
  must be equal to |mem_top| in \.{INITEX}, otherwise |>=mem_top|}
@!mem_min=0; {smallest index in \TeX's internal |mem| array;
  must be |min_halfword| or more;
  must be equal to |mem_bot| in \.{INITEX}, otherwise |<=mem_bot|}
@!buf_size=500; {maximum number of characters simultaneously present in
  current lines of open files and in control sequences between
  \.{\\csname} and \.{\\endcsname}; must not exceed |max_halfword|}
@!error_line=72; {width of context lines on terminal error messages}
@!half_error_line=42; {width of first lines of contexts in terminal
  error messages; should be between 30 and |error_line-15|}
@!max_print_line=79; {width of longest text lines output; should be at least 60}
@!stack_size=200; {maximum number of simultaneous input sources}
@!max_in_open=6; {maximum number of input files and error insertions that
  can be going on simultaneously}
@!font_max=75; {maximum internal font number; must not exceed |max_quarterword|
  and must be at most |font_base+256|}
@!font_mem_size=20000; {number of words of |font_info| for all fonts}
@!param_size=60; {maximum number of simultaneous macro parameters}
@!nest_size=40; {maximum number of semantic levels simultaneously active}
@!max_strings=3000; {maximum number of strings; must not exceed |max_halfword|}
@!string_vacancies=8000; {the minimum number of characters that should be
  available for the user's control sequences and font names,
  after \TeX's own error messages are stored}
@!pool_size=32000; {maximum number of characters in strings, including all
  error messages and help texts, and the names of all fonts and
  control sequences; must exceed |string_vacancies| by the total
  length of \TeX's own strings, which is currently about 23000}
@!save_size=600; {space for saving values outside of current group; must be
  at most |max_halfword|}
@!trie_size=8000; {space for hyphenation patterns; should be larger for
  \.{INITEX} than it is in production versions of \TeX}
@!trie_op_size=500; {space for ``opcodes'' in the hyphenation patterns}
@!dvi_buf_size=800; {size of the output buffer; must be a multiple of 8}
@!file_name_size=40; {file names shouldn't be longer than this}
@!pool_name='TeXformats:TEX.POOL                     ';
  {string of length |file_name_size|; tells where the string pool appears}
@.TeXformats@>

@ Like the preceding parameters, the following quantities can be changed
at compile time to extend or reduce \TeX's capacity. But if they are changed,
it is necessary to rerun the initialization program \.{INITEX}
@.INITEX@>
to generate new tables for the production \TeX\ program.
One can't simply make helter-skelter changes to the following constants,
since certain rather complex initialization
numbers are computed from them. They are defined here using
\.{WEB} macros, instead of being put into \PASCAL's |const| list, in order to
emphasize this distinction.

@d mem_bot=0 {smallest index in the |mem| array dumped by \.{INITEX};
  must not be less than |mem_min|}
@d mem_top==30000 {largest index in the |mem| array dumped by \.{INITEX};
  must be substantially larger than |mem_bot|
  and not greater than |mem_max|}
@y
@<Constants...@>=
@!mem_max=200000; {greatest index in \TeX's internal |mem| array;
  must be strictly less than |max_halfword|;
  must be equal to |mem_top| in \.{INITEX}, otherwise |>=mem_top|}
@!mem_min=0; {smallest index in \TeX's internal |mem| array;
  must be |min_halfword| or more;
  must be equal to |mem_bot| in \.{INITEX}, otherwise |<=mem_bot|}
@!buf_size=5000; {maximum number of characters simultaneously present in
  current lines of open files and in control sequences between
  \.{\\csname} and \.{\\endcsname}; must not exceed |max_halfword|}
@!error_line=72; {width of context lines on terminal error messages}
@!half_error_line=42; {width of first lines of contexts in terminal
  error messages; should be between 30 and |error_line-15|}
@!max_print_line=79; {width of longest text lines output; should be at least 60}
@!stack_size=1000; {maximum number of simultaneous input sources}
@!max_in_open=6; {maximum number of input files and error insertions that
  can be going on simultaneously}
@!font_max=75; {maximum internal font number; must not exceed |max_quarterword|
  and must be at most |font_base+256|}
@!font_mem_size=20000; {number of words of |font_info| for all fonts}
@!param_size=60; {maximum number of simultaneous macro parameters}
@!nest_size=40; {maximum number of semantic levels simultaneously active}
@!max_strings=60000; {maximum number of strings; must not exceed |max_halfword|}
@!string_vacancies=300000; {the minimum number of characters that should be
  available for the user's control sequences and font names,
  after \TeX's own error messages are stored}
@!pool_size=350000; {maximum number of characters in strings, including all
  error messages and help texts, and the names of all fonts and
  control sequences; must exceed |string_vacancies| by the total
  length of \TeX's own strings, which is currently about 23000}
@!save_size=600; {space for saving values outside of current group; must be
  at most |max_halfword|}
@!trie_size=8000; {space for hyphenation patterns; should be larger for
  \.{INITEX} than it is in production versions of \TeX}
@!trie_op_size=500; {space for ``opcodes'' in the hyphenation patterns}
@!dvi_buf_size=800; {size of the output buffer; must be a multiple of 8}
@!file_name_size=40; {file names shouldn't be longer than this}
@!pool_name='TeXformats:TEX.POOL                     ';
  {string of length |file_name_size|; tells where the string pool appears}
@.TeXformats@>

@ Like the preceding parameters, the following quantities can be changed
at compile time to extend or reduce \TeX's capacity. But if they are changed,
it is necessary to rerun the initialization program \.{INITEX}
@.INITEX@>
to generate new tables for the production \TeX\ program.
One can't simply make helter-skelter changes to the following constants,
since certain rather complex initialization
numbers are computed from them. They are defined here using
\.{WEB} macros, instead of being put into \PASCAL's |const| list, in order to
emphasize this distinction.

@d mem_bot=0 {smallest index in the |mem| array dumped by \.{INITEX};
  must not be less than |mem_min|}
@d mem_top==200000 {largest index in the |mem| array dumped by \.{INITEX};
  must be substantially larger than |mem_bot|
  and not greater than |mem_max|}
@z

@x
@d min_quarterword=0 {smallest allowable value in a |quarterword|}
@d max_quarterword=255 {largest allowable value in a |quarterword|}
@d min_halfword==0 {smallest allowable value in a |halfword|}
@d max_halfword==65535 {largest allowable value in a |halfword|}
@y
@d min_quarterword=0 {smallest allowable value in a |quarterword|}
@d max_quarterword=255 {largest allowable value in a |quarterword|}
@d min_halfword==0 {smallest allowable value in a |halfword|}
@d max_halfword==16777215 {largest allowable value in a |halfword|}
@z
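
The comments inside the change file above state several relationships between the constants. As a rough sketch (the function checkTexConstants and the object layout are made up for illustration and are not part of web2js or TeX), they can be checked mechanically before tangling. The figure 23000 is the one quoted in the pool_size comment for TeX's own strings; the real figure for e-TeX is assumed to be larger, so this check is only a lower bound.

```javascript
// Lower bound for the length of TeX's own strings, taken from the
// pool_size comment in the change file ("currently about 23000").
// Assumption: e-TeX needs somewhat more, so passing this check is necessary,
// not sufficient.
const OWN_STRINGS = 23000;

// Hypothetical helper: returns a list of violated constraints (empty if none).
function checkTexConstants(c) {
  const problems = [];
  if (!(c.mem_max < c.max_halfword)) {
    problems.push('mem_max must be strictly less than max_halfword');
  }
  if (c.mem_top !== c.mem_max) {
    problems.push('mem_top must equal mem_max when building INITEX');
  }
  if (c.buf_size > c.max_halfword) {
    problems.push('buf_size must not exceed max_halfword');
  }
  if (c.max_strings > c.max_halfword) {
    problems.push('max_strings must not exceed max_halfword');
  }
  if (c.pool_size - c.string_vacancies < OWN_STRINGS) {
    problems.push('pool_size must exceed string_vacancies by the length of the built-in strings');
  }
  if (c.dvi_buf_size % 8 !== 0) {
    problems.push('dvi_buf_size must be a multiple of 8');
  }
  return problems;
}

// Values from the etex.sys change file above.
const etexSys = {
  mem_max: 200000, mem_top: 200000, max_halfword: 16777215,
  buf_size: 5000, max_strings: 60000,
  string_vacancies: 300000, pool_size: 350000, dvi_buf_size: 800,
};

// Values listed in the question (dvi_buf_size left at its default).
const fromQuestion = {
  mem_max: 268435455, mem_top: 268435455, max_halfword: 268435455,
  buf_size: 200000, max_strings: 500000,
  string_vacancies: 90000, pool_size: 6250000, dvi_buf_size: 800,
};

console.log(checkTexConstants(etexSys));
console.log(checkTexConstants(fromQuestion));
```

With the values from the question, the only rule flagged is that mem_max is not strictly less than max_halfword. That is a statement about the documented constraint only, not a diagnosis of the memory error shown in the question.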

Now you can run tangle

tangle -underline etex.web etex.sys

and you will get the files etex.p and etex.pool.

Of course web2js will still look for tex.pool, but you can change

filename = "tex.pool";

to

filename = "etex.pool";

in header.js and library.js.

Now let's try

node compile.js etex.p

Similar to your original experiment, we get

[...]

Need 41 of memory

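Now 41 is clearly smaller than 32906, and in particular it is below 32767. So we can allocate more memory. This needs to be done consistently in four files: in index.js, initex.js, tex.js and pascal/program.js, change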

var pages = 20;

to

var pages = 50;

(41 would probably be enough, but 50 looks nicer)

Now we can try
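
To put these page counts in perspective, here is a small sketch of the arithmetic, using only the figures quoted in this thread (64 KiB per page, a limit of 32767 pages). The names PAGE_KIB, MAX_PAGES, pagesToBytes and fitsLimit are illustrative and do not appear in web2js.

```javascript
// One WebAssembly memory page is 64 KiB (65536 bytes), as noted in the question.
const PAGE_KIB = 64;
// Implementation limit quoted in the question's notes.
const MAX_PAGES = 32767;

// Size in bytes of a memory with the given number of pages.
function pagesToBytes(pages) {
  return pages * PAGE_KIB * 1024;
}

// Whether a request for this many pages stays within the quoted limit.
function fitsLimit(pages) {
  return pages <= MAX_PAGES;
}

// 20 is the web2js default, 41 is what the compiler asked for,
// 50 is the value chosen here, 32906 is the request from the question.
for (const pages of [20, 41, 50, 32906]) {
  console.log(`${pages} pages = ${pages * PAGE_KIB} KiB, within limit: ${fitsLimit(pages)}`);
}
```

The same check explains why the constants from the question could not work: 32906 pages is just above the quoted limit, while 50 pages is only a few MiB.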

node compile.js etex.p

again. This time it really works! You can now use node initex.js to get a plain TeX format, but we actually want eTeX. So get copies of etex.src, etexdefs.lib and language.def, and change

library.setInput("\nplain \\dump\n\n"

in initex.js to

library.setInput("\n*etex \\dump\n\n"

Here the asterisk * is important: it enables "extended mode". In the same file, also change &plain to &etex so that etex is preloaded.

Then
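
The two input strings differ only in the file name and the leading asterisk. A minimal sketch of that pattern follows; initexInput is a hypothetical helper written for illustration, not a function provided by web2js, and the result would still have to be passed to library.setInput by hand.

```javascript
// Hypothetical helper: build the first-line input for INITEX.
// `name` is the file to load (e.g. "plain" or "etex"); `extended` adds the
// leading "*" that switches e-TeX into extended mode.
function initexInput(name, extended) {
  const prefix = extended ? '*' : '';
  return `\n${prefix}${name} \\dump\n\n`;
}

// JSON.stringify makes the newlines and the backslash visible when printed.
console.log(JSON.stringify(initexInput('plain', false)));
console.log(JSON.stringify(initexInput('etex', true)));
```

Without the asterisk, e-TeX starts in compatibility mode, which is why the prefix is kept as an explicit flag here rather than folded into the name.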

node initex.js

generates the e-TeX format etex.fmt and a memory dump, which can be used with

node tex.js

Answer 2

I managed to get a LaTeX format working with web2js, with some caveats.

Here is a sequence of steps that works (for me).

  1. Get web2js: download the zip file and unpack it, or run

    git clone https://github.com/kisonecat/web2js.git
    
  2. Get tex.web: download it with a browser, or run:

    wget http://mirrors.ctan.org/systems/knuth/dist/tex/tex.web
    
  3. Get etex.ch: download it with a browser, or run:

    wget -O etex.ch 'https://tug.org/svn/texlive/trunk/Build/source/texk/web2c/etexdir/etex.ch?revision=32727&view=co'
    
  4. Tie them together:

    tie -m mytex.web tex.web etex.ch
    
  5. Make the following changes to the resulting file (or do it the "proper" way with an etex.sys etc., as in Marcel Krüger's answer):

    @!mem_max=30000; {greatest index in \TeX's  |   @!mem_max=400000; {greatest index in \TeX'
    @!stack_size=200; {maximum number of simul  |   @!stack_size=1000; {maximum number of simu
    @!max_in_open=6; {maximum number of input   |   @!max_in_open=15; {maximum number of input
    @!max_strings=3000; {maximum number of str  |   @!max_strings=60000; {maximum number of st
    @!string_vacancies=8000; {the minimum numb  |   @!string_vacancies=300000; {the minimum nu
    @!pool_size=32000; {maximum number of char  |   @!pool_size=350000; {maximum number of cha
    @!trie_size=8000; {space for hyphenation p  |   @!trie_size=600000; {space for hyphenation
    @!trie_op_size=500; {space for ``opcodes''  |   @!trie_op_size=10000; {space for ``opcodes
    @d mem_top==30000 {largest index in the |m  |   @d mem_top==400000 {largest index in the |
    @d hash_size=2100 {maximum number of contr  |   @d hash_size=15000 {maximum number of cont
    @d hyph_size=307 {another prime; the numbe  |   @d hyph_size=2003 {another prime; the numb
    for i:=0 to @'37 do xchr[i]:=' ';           |   for i:=0 to @'37 do xchr[i]:=chr(i);
    for i:=@'177 to @'377 do xchr[i]:=' ';      |   for i:=@'177 to @'377 do xchr[i]:=chr(i);
    @d max_quarterword=255 {largest allowable   |   @d max_quarterword=65535 {largest allowabl
    @d max_halfword==65535 {largest allowable   |   @d max_halfword==16777215 {largest allowab
    

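    These values were determined empirically, by increasing whichever ones I hit errors with. The change to the xchr assignments follows the discussion in another question.

  6. Correspondingly, edit the four files index.js, initex.js, pascal/program.js and tex.js, changing var pages = 20; to var pages=290;. (Actually, while playing with this, I created a file commonMemory.js containing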

    module.exports = { commonPages: function() { return 290; } };
    

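    and used var pages = require('./commonMemory').commonPages(); in those files. But that was only a convenience while finding the number 290; you don't have to do it.)

  7. Edit library.js: inside the function reset, change this block: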

        files.push({
          filename: filename,
          position: 0,
          descriptor: fs.openSync(filename,'r'),
        });

    to:
    

        let basename = filename.slice(filename.lastIndexOf('/') + 1);
        const {spawnSync} = require('child_process');
        let realFilename = spawnSync('kpsewhich', [filename]).stdout.toString().trim();
        if (realFilename == '') {
            // try again with basename
            realFilename = spawnSync('kpsewhich', [basename]).stdout.toString().trim();
            if (realFilename == '') {
                // Give up, just create empty file
                spawnSync('touch', [basename]);
                realFilename = basename;
                console.log(`For filename #${filename}# created empty #${basename}#`);
            } else {
                console.log(`Found filename #${filename}# via basename at #${realFilename}#`);
            }
        } else {
            console.log(`Found filename #${filename}# at #${realFilename}#`);
        }
    
        files.push({
          filename: filename,
          position: 0,
          descriptor: fs.openSync(realFilename,'r'),
        });
    

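    The idea is that building the LaTeX format loads a large number of files, some of which are not even distributed with TeX Live. So we hook a kpsewhich file lookup into the code to find all of them, and fall back to an empty file when one is not found. For what it's worth, these are the files that were not found and for which empty files were used: babel-latex.cfg, il2enc.dfu, omlenc.dfu, omxenc.dfu, uenc.dfu.

  8. Edit initex.js to dump the LaTeX format instead of the plain format (and again where it makes the core dump):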

    -library.setInput("\nplain \\dump\n\n",
    +library.setInput("\n*latex.ltx \\dump\n\n",
    

    -library.setInput("\n&plain\n\n",
    +library.setInput("\n&latex\n\n",
    
  9. Replace the contents of sample.tex with a LaTeX example. For instance, you can use (from here):

    \documentclass{article}
    \title{Cartesian closed categories and the price of eggs}
    \author{Jane Doe}
    \date{September 1994}
    \begin{document}
       \maketitle
       Hello world!
    \end{document}
    
  10. Get the web2js dependencies and build its Pascal parser:

    npm install
    npm run-script build
    
  11. Build everything: from WEB (via TANGLE) to Pascal, (via web2js) to WASM, then load and dump the format file and memory dump, then run TeX:

    tangle -underline mytex.web && \
    mv -f mytex.pool tex.pool && \
    node compile.js mytex.p && \
    node initex.js && \
    node tex.js
    

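Note that sample.dvi was created successfully and looks fine. So we have a working LaTeX format. You can try editing sample.tex and re-running node tex.js to typeset various LaTeX documents (to DVI).

Caveats:

  • Because those missing files were replaced by empty ones, hyphenation patterns for non-English languages or particular font encodings may not work properly. But I could not find these files even in the TeX Live sources, so I am not sure what they are supposed to contain, or whether they are supposed to be empty.

  • The first revision of this answer had a way to build a LaTeX format without increasing max_quarterword/max_halfword, or increasing the number of memory pages granted on the JS side. The price was that hyphenation patterns for most languages could not be loaded, and it was not enough to load heavyweight packages like TikZ either. The current revision does not have these problems.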
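
One portability note on step 7: the fallback there shells out to touch, which is assumed not to be available on a stock Windows system such as the one in the question. A minimal sketch of an alternative that uses only Node's fs module is shown below; ensureFileExists is a hypothetical helper, and the demo writes into a temporary directory rather than the web2js checkout.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical replacement for spawnSync('touch', [name]):
// create `name` as an empty file if it is missing, leave it alone otherwise.
function ensureFileExists(name) {
  // Flag 'a' (append) creates the file when absent and never truncates it.
  fs.closeSync(fs.openSync(name, 'a'));
  return name;
}

// Demo in a throwaway directory, using one of the file names that
// kpsewhich failed to find in step 7.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'web2js-demo-'));
const demoFile = ensureFileExists(path.join(demoDir, 'uenc.dfu'));
console.log(`${demoFile} exists with size ${fs.statSync(demoFile).size}`);
```

In library.js this would stand in for the spawnSync('touch', [basename]) line only; the kpsewhich lookups themselves are unchanged.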
