Code Review: LaTeX profiling in expl3

Question 1

First of all, great job on this package! I didn't find any major issues, just a few stylistic ones:

\ProvidesExplPackage{profiling}
  {2023/08/08}
  {1.0}
  {A package for profiling code using expl3}

Since your package requires hook support, I would add \NeedsTeXFormat{LaTeX2e}[2020/10/01] before this.
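As a minimal sketch of that ordering (the date and package metadata are copied from the code above, nothing else is assumed), the package header would start like this:

```latex
% Require a kernel with hook support before declaring the package.
\NeedsTeXFormat{LaTeX2e}[2020/10/01]
\ProvidesExplPackage{profiling}
  {2023/08/08}
  {1.0}
  {A package for profiling code using expl3}
```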

\ExplSyntaxOn

\ProvidesExplPackage already switches on \ExplSyntaxOn, so it isn't needed here.

\RequirePackage{expl3,l3keys2e}

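\ProvidesExplPackage and \ExplSyntaxOn won't work without expl3, so if you have gotten this far, expl3 is already loaded. You could load the package earlier, but since you use hooks in this package, you already require a LaTeX kernel that includes expl3.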

\bool_new:N \g_profiling_enabled_bool
\bool_new:N \g_profiling_total_bool
\bool_new:N \g_profiling_packages_bool
\bool_new:N \g_profiling_document_bool
\bool_new:N \g_profiling_preamble_bool
\bool_set_true:N \g_profiling_enabled_bool
\bool_set_true:N \g_profiling_total_bool
\bool_set_true:N \g_profiling_packages_bool
\bool_set_true:N \g_profiling_document_bool
\bool_set_true:N \g_profiling_preamble_bool

\int_new:N \g_profiling_max_level_int

\tl_new:N \g_profiling_file_name_tl

Since the key-val setup initializes these variables to the same values, these are redundant.

Also, these variables should probably be "internal", so use \g__… to indicate that.
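One caveat: .default:n only supplies a value when a key is given without one, so the boolean keys initialize their variables only if they also carry an .initial:n. A minimal sketch for two of the keys, assuming the double-underscore names below (they are illustrative, not from the original code):

```latex
% The boolean key properties create the variable if it does not exist yet,
% so no separate \bool_new:N / \bool_set_true:N lines are needed here.
\keys_define:nn { profiling }
  {
    disable .bool_gset_inverse:N = \g__profiling_enabled_bool ,
    disable .default:n           = true ,
    disable .initial:n           = false , % inverse key: enabled starts true

    total   .bool_gset:N         = \g__profiling_total_bool ,
    total   .default:n           = true ,
    total   .initial:n           = true ,
  }
```

The remaining boolean keys would follow the same pattern.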

\keys_define:nn { profiling }
{
    disable  .bool_gset_inverse:N = \g_profiling_enabled_bool,
    disable  .default:n  = true,
    total .bool_gset:N = \g_profiling_total_bool,
    total .default:n  = true,
    packages .bool_gset:N = \g_profiling_packages_bool,
    packages .default:n  = true,
    document .bool_gset:N = \g_profiling_document_bool,
    document .default:n  = true,
    preamble .bool_gset:N = \g_profiling_preamble_bool,
    preamble .default:n  = true,
    file-name .tl_gset:N = \g_profiling_file_name_tl,
    file-name .initial:n  = profiling,
    max-level .int_gset:N = \g_profiling_max_level_int,
    max-level .initial:n = 9999,
}

I'd recommend a blank line between each new key, but that's just personal preference.

\ProcessKeysOptions{profiling}

Newer versions of the kernel include their own key-val option processor, so I'd recommend using the following:

\IfFormatAtLeastTF { 2022-06-01 } {
    \ProcessKeyOptions [ profiling ]
}{
    \RequirePackage { l3keys2e }
    \ProcessKeysOptions { profiling }
}

\bool_if:NF \g_profiling_enabled_bool
{
    \NewDocumentCommand\StartProfiling{ m }{}
    \NewDocumentCommand\StopProfiling{ m }{}
    \NewDocumentCommand\WriteProfilingResults{ O{profiling} }{}
    \NewDocumentCommand\ProfileMacro{ m }{}
    \ExplSyntaxOff

\ExplSyntaxOff is redundant.

    \endinput

Use \file_input_stop: instead of \endinput.

}
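Putting the two remarks above together, the disabled branch could look roughly like this. It is an untested sketch that only rearranges the original code:

```latex
\bool_if:NF \g_profiling_enabled_bool
  {
    % Define no-op versions of the user commands ...
    \NewDocumentCommand \StartProfiling        { m }            { }
    \NewDocumentCommand \StopProfiling         { m }            { }
    \NewDocumentCommand \WriteProfilingResults { O{profiling} } { }
    \NewDocumentCommand \ProfileMacro          { m }            { }
    % ... and stop reading the package file. No \ExplSyntaxOff is needed,
    % because \ProvidesExplPackage restores the syntax at the end of the file.
    \file_input_stop:
  }
```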

% required variants
\cs_generate_variant:Nn \int_gset:Nn { cf }
\cs_generate_variant:Nn \int_gset:Nn { Nf }

Don't use f variants unless you have a very specific reason to. Use e variants instead.

% initialize a list of tags to output all the data at the end
\seq_new:N \g_profiling_all_instances_seq

% initialize a second list of tags to keep track of the scope that is currently being profiled
\seq_new:N \g_profiling_instances_seq

% initialize a counter to keep track of the current scope level
\int_new:N \g_profiling_scope_int

% global start and end time
\int_new:N \g_profiling_end_int
\int_new:N \g_profiling_start_int
\int_gset:Nf \g_profiling_start_int { \sys_timer: }

\g_profiling_start_int is never used (and would be better as a constant anyway).
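A possible way to store the start time as a constant, assuming the hypothetical name \c__profiling_start_int. Integer expressions expand their argument fully, so no f or e variant should be needed to evaluate \sys_timer: here:

```latex
% Evaluated once, when the package is loaded.
% \c__profiling_start_int is an illustrative name, not from the original code.
\int_const:Nn \c__profiling_start_int { \sys_timer: }
```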

% define a message for when the scope doesn't match
\msg_new:nnn { profiling } { scope-mismatch }
{
    The~scope~'#1'~does~not~match~the~current~scope~'#2'.~
    Maybe~you~forgot~to~close~a~scope~or~you~closed~a~scope~too~early?
}

\msg_new:nnn { profiling } { instance-exists }
{
    The~profiling~instance~'#1'~already~exists.~This~is~a~bug!
}

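I usually see messages defined either at the beginning of the package or near where they are used. Defining them in the middle is unconventional (although by no means wrong).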

% ---------------------------------------------------------------------------- %
%                                Start Profiling                               %
% ---------------------------------------------------------------------------- %
% Start profiling for a given tag. The tag might be profiled multiple times.
% Each time it is profiled, it will be called a new "instance".
% The general instance name will be <tag>/<instance number>.
% If the instance name is specified, the actual profiling will be started in
% \profiling_start_instance:n
\cs_new:Npn \profiling_start:n #1 {

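The body of this command is not expandable, so you should use \cs_new_protected:Npn. In fact, you should use \cs_new_protected:Npn for all of the commands that you define in this package.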

    % check if the tag already exists
    \int_if_exist:cTF { g_profiling_tag_#1_total_int }
    {
        \int_gincr:c { g_profiling_tag_#1_total_int }
    }{
        \seq_new:c { g_profiling_tag_#1_recurse_seq }
        \int_new:c { g_profiling_tag_#1_total_int }
        \int_gset:cn { g_profiling_tag_#1_total_int } { 1 }
    }
    \seq_gput_right:cx
        { g_profiling_tag_#1_recurse_seq }
        { \int_use:c { g_profiling_tag_#1_total_int } }
    \seq_get_right:cN { g_profiling_tag_#1_recurse_seq } \l_tmpa_tl
    \profiling_start_instance:x { #1/\tl_use:N \l_tmpa_tl }
}

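Instead of retrieving the tag data via csnames, I'd suggest using property lists. You could use <tag name> as the key and a seq (total, recurse) as the value, or use <tag name>-total and <tag name>-recurse as keys with the corresponding data as values.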
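To illustrate the property-list idea, here is a small untested sketch that keeps only the per-tag total in a prop. All names with a double underscore are hypothetical, and the recursion stack from the original code is left out:

```latex
\prop_new:N \g__profiling_tag_total_prop
\tl_new:N   \l__profiling_count_tl
% Does nothing if the kernel already provides this variant.
\cs_generate_variant:Nn \prop_gput:Nnn { Nne }

% Increment the instance counter stored under key #1 (the tag name),
% starting at 1 the first time the tag is seen.
\cs_new_protected:Npn \__profiling_tag_count_incr:n #1
  {
    \prop_get:NnNTF \g__profiling_tag_total_prop {#1} \l__profiling_count_tl
      {
        \prop_gput:Nne \g__profiling_tag_total_prop {#1}
          { \int_eval:n { \l__profiling_count_tl + 1 } }
      }
      { \prop_gput:Nnn \g__profiling_tag_total_prop {#1} { 1 } }
  }
```

This avoids creating a new int and seq variable for every tag, at the cost of one extra lookup per call.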

% Define a function to record the current timestamp and store it in a new macro
% with the name '\g_profiling_<tag>_start_int'.
\cs_new:Npn \profiling_start_instance:n #1
{
    \int_if_exist:cTF { g_profiling_#1_start_int }
    {
        % warning if the tag already exists
        \msg_critical:nnn { profiling } { instance-exists } { #1 }

critical is too harsh here. I'd suggest warning, but error would also be fine.

    }{
        \int_new:c { g_profiling_#1_start_int }
        \int_gset:cf { g_profiling_#1_start_int } { \sys_timer: }
        \seq_gput_right:Nn \g_profiling_all_instances_seq { #1 }
        \seq_gput_right:Nn \g_profiling_instances_seq { #1 }
        \int_gincr:N \g_profiling_scope_int
    }
}


\cs_generate_variant:Nn \profiling_start:n { x }
\cs_generate_variant:Nn \profiling_start_instance:n { x }

I prefer e variants over x variants.
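The main differences are that e-type expansion is itself expandable and does not require doubling # characters. A minimal sketch, assuming \profiling_start_instance:n is defined as above (the helper name is made up for illustration):

```latex
\cs_generate_variant:Nn \profiling_start_instance:n { e }

% Hypothetical helper: start instance number #2 of tag #1.
\cs_new_protected:Npn \__profiling_start_numbered:nn #1 #2
  { \profiling_start_instance:e { #1 / \int_eval:n {#2} } }
```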

% ---------------------------------------------------------------------------- %
%                                Stop Profiling                                %
% ---------------------------------------------------------------------------- %
% Stop profiling for a given tag. The tag might be profiled multiple times.
% This function will stop the most recent instance of the tag (also, if the tag
% has been started recursively)
\cs_new:Npn \profiling_stop:n #1 {
    \seq_gpop_right:cN { g_profiling_tag_#1_recurse_seq } \l_tmpa_tl
    \profiling_stop_instance:x { #1/\tl_use:N \l_tmpa_tl }

Spaces are ignored, so I'd write #1 / \tl_use:N \l_tmpa_tl to make it easier to read.

}

\cs_new:Npn \profiling_stop_instance:n #1
{
    % check if the currently profiled instance is the same as the instance that should be stopped
    % the scopes of the profiled instances may not overlap!
    \seq_gpop_right:NN \g_profiling_instances_seq \l_tmpa_tl
    \tl_set:Nf \l_tmpb_tl { #1 }
    \tl_if_eq:NNTF \l_tmpa_tl \l_tmpb_tl

Skip the \tl_set:Nf \l_tmpb_tl and use \tl_if_eq:NnTF \l_tmpa_tl { #1 } … instead.

    {
        % if it is, save the data in a macro and store it in a list of results
        % it will be written to the file at the end of the document to avoid influencing the timing
        \int_new:c   { g_profiling_#1_end_int }
        \int_gset:cf { g_profiling_#1_end_int } { \sys_timer: }

        \int_new:c      { g_profiling_#1_scope_int }
        \int_gset_eq:cN { g_profiling_#1_scope_int } \g_profiling_scope_int

        % decrement the scope counter
        \int_gdecr:N \g_profiling_scope_int
    }
    {
        % if it isn't, then something went wrong
        \msg_critical:nnxx
            { profiling }
            { scope-mismatch }
            { \tl_use:N \l_tmpb_tl }
            { \tl_use:N \l_tmpa_tl }

Again, prefer warning or error.

    }
}
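With the suggestions above applied, the function might look like the following untested sketch. It is meant as a replacement for the definition above, not an addition to it, and it assumes the message and variables from the original code:

```latex
% Does nothing if the kernel already provides this variant.
\cs_generate_variant:Nn \msg_error:nnnn { nnnV }

\cs_new_protected:Npn \profiling_stop_instance:n #1
  {
    \seq_gpop_right:NN \g_profiling_instances_seq \l_tmpa_tl
    % Compare the popped instance directly with the argument.
    \tl_if_eq:NnTF \l_tmpa_tl {#1}
      {
        \int_new:c      { g_profiling_#1_end_int }
        \int_gset:cn    { g_profiling_#1_end_int } { \sys_timer: }
        \int_new:c      { g_profiling_#1_scope_int }
        \int_gset_eq:cN { g_profiling_#1_scope_int } \g_profiling_scope_int
        \int_gdecr:N \g_profiling_scope_int
      }
      {
        % An error instead of a critical message: the run can continue.
        \msg_error:nnnV { profiling } { scope-mismatch } {#1} \l_tmpa_tl
      }
  }
```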

% ---------------------------------------------------------------------------- %
\cs_generate_variant:Nn \profiling_stop:n { x }
\cs_generate_variant:Nn \profiling_stop_instance:n { x }

% ---------------------------------------------------------------------------- %
%                                 Write Results                                %
% ---------------------------------------------------------------------------- %
\fp_new:N \l_profiling_diff_fp
\fp_new:N \l_profiling_perc_fp
\iow_new:N \g_profiling_iow

% This function will write the results of a single instance to the file
% #1: tag
% #2: scope
% #3: start (unscaled)
% #4: end (unscaled)
% #5: total time (scaled!)
\cs_new:Npn \profiling_write_line:nnnnn #1 #2 #3 #4 #5{
    % time difference ( * 1/(2^16) )
    \fp_set:Nn \l_tmpa_fp { #3 * 0.0000152587890625 }

I'd suggest using \fp_const:Nn \c__profiling_scale_fp { 2 ^ -16 } instead of a hard-coded number typed in by hand.

    \fp_set:Nn \l_tmpb_fp { #4 * 0.0000152587890625 }
    \fp_set:Nn \l_profiling_diff_fp { \l_tmpb_fp - \l_tmpa_fp }

Do the subtraction with ints first to avoid any loss of precision. So first use \int_set:Nn \l_tmpa_int { #4 - #3 }, and then \fp_set:Nn \l_tmpa_fp { \l_tmpa_int * \c__profiling_scale_fp }.


    % percentage of the total time
    \fp_set:Nn \l_profiling_perc_fp { \l_profiling_diff_fp / #5 * 100 }

    \iow_now:Nx \g_profiling_iow {
        #1, % tag
        #2, % scope
        \fp_use:N \l_tmpa_fp, % start
        \fp_use:N \l_tmpb_fp, % end
        \fp_use:N \l_profiling_diff_fp, % difference
        \fp_use:N \l_profiling_perc_fp % percentage
    }
}
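The two remarks about the scale factor and the integer subtraction can be combined into a small helper. This is an untested sketch: the helper and variable names are hypothetical, and the constant is written as 1 / 65536, which equals 2^{-16}:

```latex
\fp_const:Nn \c__profiling_scale_fp { 1 / 65536 } % = 2^{-16}
\int_new:N   \l__profiling_diff_int
\fp_new:N    \l__profiling_diff_fp

% #1: start (unscaled), #2: end (unscaled)
\cs_new_protected:Npn \__profiling_set_diff:nn #1 #2
  {
    % Subtract as integers first, then convert to seconds once.
    \int_set:Nn \l__profiling_diff_int { (#2) - (#1) }
    \fp_set:Nn  \l__profiling_diff_fp
      { \l__profiling_diff_int * \c__profiling_scale_fp }
  }
```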
% ---------------------------------------------------------------------------- %

\fp_new:N \l_profiling_total_fp
\int_new:N \l_profiling_scope_int

% This function will write the results of all instances to the file
\cs_new:Npn \profiling_write_results:n #1
{
    % if profiling of packages is enabled, remove the first element of the list of tags,
    % as it is the name of the current file
    \bool_if:NT \g_profiling_packages_bool
    {
        \seq_gpop_left:NN \g_profiling_all_instances_seq \l_tmpa_tl
    }

    % get current time (latest possible moment)
    \int_gset:Nf \g_profiling_end_int { \sys_timer: }
    \fp_set:Nn \l_profiling_total_fp { \g_profiling_end_int * 0.0000152587890625 }

    % open a new file named 'profiling.csv' to write the data
    \iow_open:Nn \g_profiling_iow { \tl_use:N \g_profiling_file_name_tl .csv }

The file name is only expanded here as an implementation detail, so it would be better to use \iow_open:Ne.

    % write the header
    \iow_now:Nx \g_profiling_iow{Tag,Scope,Start,End,Difference,Percentage}

    % write total time (if requested)
    \bool_if:NT \g_profiling_total_bool {
        \profiling_write_line:nnnnn
            { total }
            { 0 }
            { 0 }
            { \g_profiling_end_int }
            { \fp_use:N \l_profiling_total_fp }
    }

    % loop through the list of tags and write the data to the file
    \seq_map_inline:Nn \g_profiling_all_instances_seq
    {
        \int_compare:nNnT
            { \int_use:c {g_profiling_##1_scope_int} } <

Use \int_compare:cNnT { g_profiling_##1_scope_int } … instead.

            { \int_use:N \g_profiling_max_level_int + 1 }

The \int_use:N isn't needed here, since \int_compare itself expands any int variables.

        {
            \profiling_write_line:nnnnn
                { ##1 }
                { \int_use:c {g_profiling_##1_scope_int} }
                { \int_use:c {g_profiling_##1_start_int} }
                { \int_use:c {g_profiling_##1_end_int} }
                { \fp_use:N \l_profiling_total_fp }

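This is what v variants are for. Assuming that you have defined the variant, you should use \profiling_write_line:nvvvV { ##1 } { g_profiling_##1_scope_int } … \l_profiling_total_fp (V rather than v for the last argument, since it is given as a variable rather than as a csname).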

        }
    }
}
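The remarks on the file name, the comparison, and the v variants could be combined as in the untested sketch below. The helper name is hypothetical, the variants are assumed not to exist yet, and \exp_args:Nc stands in here for a c-type conditional variant, which would otherwise have to be generated first:

```latex
% Generating a variant that already exists does nothing.
\cs_generate_variant:Nn \iow_open:Nn { Ne }
\cs_generate_variant:Nn \profiling_write_line:nnnnn { nvvvV }

% Hypothetical helper containing only the parts discussed above.
\cs_new_protected:Npn \__profiling_write_instances:
  {
    \iow_open:Ne \g_profiling_iow { \g_profiling_file_name_tl .csv }
    \seq_map_inline:Nn \g_profiling_all_instances_seq
      {
        % Build the variable name once and pass it as a single token;
        % the integer expression on the right needs no \int_use:N.
        \exp_args:Nc \int_compare:nNnT { g_profiling_##1_scope_int }
          < { \g_profiling_max_level_int + 1 }
          {
            \profiling_write_line:nvvvV
              {##1}
              { g_profiling_##1_scope_int }
              { g_profiling_##1_start_int }
              { g_profiling_##1_end_int }
              \l_profiling_total_fp
          }
      }
  }
```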

% ---------------------------------------------------------------------------- %
%                             User level functions                             %
% ---------------------------------------------------------------------------- %
% Create user-level functions for starting and stopping profiling
\NewDocumentCommand\StartProfiling{ m }{
    \profiling_start:x { #1 }
}

\NewDocumentCommand\StopProfiling{ m }{
    \profiling_stop:x { #1 }
}

\NewDocumentCommand\WriteProfilingResults{O{profiling}}{
    \profiling_write_results:n {#1}
}

\NewDocumentCommand{\ProfileMacro}{ m }{
    \AddToHook{cmd/#1/before}[profiling/start]{ \profiling_start:x { m/#1 } }
    \AddToHook{cmd/#1/after}[profiling/stop]{ \profiling_stop:x { m/#1 } }
}

I'd suggest using more whitespace. For example:

\NewDocumentCommand \ProfileMacro { m } {
    \AddToHook
        { cmd / #1 / before }
        [ profiling / start ]
        { \profiling_start:x { m / #1 } }

    \AddToHook
        { cmd / #1 / after }
        [ profiling / stop ]
        { \profiling_stop:x { m / #1 } }
}

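Other than the above, I didn't spot any other obvious issues. (I also haven't tested any of my suggestions, so if something doesn't work, that's probably my fault.)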

Where does the "white space" in the document and preamble that I did not measure come from?

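That's everything that happens before profiling is activated. Just subtract \g_profiling_start_int from \g_profiling_end_int to remove it. (Looking at the code, I'd guess that you intended to do this but forgot.)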
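A rough, untested sketch of that subtraction, using the variables from the original code inside a hypothetical helper:

```latex
% Hypothetical helper: total time since the package was loaded, in seconds.
\cs_new_protected:Npn \__profiling_set_total:
  {
    \int_gset:Nn \g_profiling_end_int { \sys_timer: }
    \fp_set:Nn \l_profiling_total_fp
      { ( \g_profiling_end_int - \g_profiling_start_int ) / 65536 }
  }
```

The per-instance start and end values are still measured from the timer's zero point, so they would need the same offset subtracted if the Start and End columns should line up with the new total.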

Is there a way to produce this visualization without Python? Maybe with pgfplots on a second run?

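pgfplots can parse CSV files fairly easily (see the pgfplots and pgfplotstable manuals), and you should be able to get a similar output using the polar pgfplots library and the right settings.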
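As a starting point, here is an untested sketch that reads the CSV and draws a plain horizontal bar chart rather than the polar layout. It assumes the default file name profiling.csv and the header row written by the package (Tag, ..., Difference, ...). Tags that contain special characters such as underscores would probably need extra care in the tick labels.

```latex
\documentclass{article}
\usepackage{pgfplots}
\usepackage{pgfplotstable}
\pgfplotsset{compat=1.18}

% Read the CSV produced by a previous run (file name assumed).
\pgfplotstableread[col sep=comma]{profiling.csv}\profilingdata

\begin{document}
\begin{tikzpicture}
  \begin{axis}[
      xbar,
      xlabel={Time (s)},
      ytick=data,
      yticklabels from table={\profilingdata}{Tag},
      y dir=reverse, % first CSV row at the top
    ]
    % One bar per row: bar length is the measured difference.
    \addplot table [x=Difference, y expr=\coordindex] {\profilingdata};
  \end{axis}
\end{tikzpicture}
\end{document}
```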

Answer