Fast membership test for a list/set of integers

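Assume we have a large list of integers, stored as comma-separated values in a macro or token register. Now we want to test, in a loop over every integer n = 1, 2, ..., whether n appears in the list of numbers. For a small list, traversing it until a match is found is fine. For larger inputs, traversing the whole list over and over again may slow down compilation significantly, since this is O(n^2) behaviour.

What are more efficient ways to test integers for membership?

Edit: more information, as requested:

The solution should work with pdflatex, so no Lua code. An expl3 solution is fine.

For my use case, the large list of numbers can be assumed to be in ascending order. Even for a more general solution, if an unsorted list is supplied, we can apply \clist_sort:Nn to obtain a sorted input list.

My particular use case is placing markers in the document during the first compilation run; the markers are numbered by a counter. At the end of the run, a possibly large list of some (not all) of these marker numbers is written to the .aux file and processed on the next run. At each marker position, the list has to be tested for that particular number. In the worst case every number is in the list, and the list has to be traversed up to the position of the marker number each time, which gives O(n^2) behaviour overall.

Apart from this particular use case, I think the question may be of interest for other problems as well.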

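Answer 1

The classic approach here is to use a (small) font as an array, exploiting its font dimensions (\fontdimen parameters). For a single array we could do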

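% Load cmr10 at an otherwise unused size so that it can serve purely as storage
\font\myintarray = cmr10 at 1sp %
% Set entries 1..11 to zero; assigning beyond the current number of parameters
% extends the font, which TeX allows only for the most recently loaded font
\count255 = 0 %
\loop
  \advance\count255 by 1 %
  \fontdimen\count255 \myintarray = 0sp %
  \ifnum\count255 < 11 %
\repeat
% \setarray{<index>}{<value>}: store an integer as a dimension in sp
\protected\def\setarray#1#2{%
  \fontdimen#1 \myintarray = #2sp %
}
% \getarray{<index>}: expands to the stored integer
\def\getarray#1{%
  \number\fontdimen#1 \myintarray
}
\setarray{5}{27}
% Typeset entries 1..11; the counter has to restart from 0 here
\count255 = 0 %
\loop
  \advance\count255 by 1 %
  \getarray{\count255 } %
  \ifnum\count255 < 11 %
\repeat
\bye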

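With more arrays we need some bookkeeping (each array has to be a separate font). These structures are global, but they have constant access time (so a mapping over the array takes linear time).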
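
As a rough sketch of that bookkeeping (not taken from the answer itself), one could allocate each array as cmr10 at its own, otherwise unused size, since asking for a size that is already in memory would hand back the existing font rather than a fresh one. The names \newintarray, \arraycount, \arrayA and \arrayB are hypothetical, and the sketch assumes that no other code loads cmr10 at these tiny sizes:

\newcount\arraycount
% \newintarray<cs>{<size>}: hypothetical allocator, one font per array
\def\newintarray#1#2{%
  \global\advance\arraycount by 1 %
  % use a different "at" size each time so that a new font is really loaded
  \global\font#1 = cmr10 at \the\arraycount sp %
  % extend and zero the entries straight away, while this font is still
  % the most recently loaded one
  \count255 = 0 %
  \loop
    \ifnum\count255 < #2 %
    \advance\count255 by 1 %
    \fontdimen\count255 #1 = 0sp %
  \repeat
}
\newintarray\arrayA{10}
\newintarray\arrayB{20}
% the two arrays are independent: writing to \arrayA leaves \arrayB untouched
\fontdimen5 \arrayA = 27sp %
\number\fontdimen5 \arrayA\ and \number\fontdimen5 \arrayB
\bye

Each array has to be sized when it is created, because TeX only lets the most recently loaded font grow.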


In expl3, this approach is abstracted as the intarray data type

\intarray_new:Nn \g_my_intarray { 100 }
\intarray_gset:Nnn \g_my_intarray { 5 } { 27 }
\intarray_item:Nn \g_my_intarray { 5 }

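In terms of limits, the key one is that the maximum value is one power of two lower than the usual TeX limit (2^{30} - 1 rather than 2^{31} - 1). As far as I know, there is no predetermined limit on the number of fonts that can be loaded. However, the total number of fontdimens (i.e. the number of items in the arrays) is limited: with standard settings, 4 million entries are allowed.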
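
For the membership test in the question, a minimal sketch along these lines might use one intarray as a table of flags. The names \g_demo_flags_intarray, \demo_mark_members:n, \demo_if_member:nTF, \MarkMembers and \IfMemberTF are made up for illustration, the size 1000 is an assumed upper bound on the marker numbers, and \NewDocumentCommand assumes a recent LaTeX kernel (or xparse):

\documentclass{article}
\ExplSyntaxOn
% one flag per possible marker number; all entries start at 0
\intarray_new:Nn \g_demo_flags_intarray { 1000 }
% set the flag of every number in the comma list: one pass over the list
\cs_new_protected:Npn \demo_mark_members:n #1
  {
    \clist_map_inline:nn {#1}
      { \intarray_gset:Nnn \g_demo_flags_intarray {##1} { 1 } }
  }
% constant-time test: look at a single entry of the array
\prg_new_conditional:Npnn \demo_if_member:n #1 { T , F , TF }
  {
    \int_compare:nNnTF
      { \intarray_item:Nn \g_demo_flags_intarray {#1} } = { 1 }
      { \prg_return_true: } { \prg_return_false: }
  }
\NewDocumentCommand \MarkMembers { m } { \demo_mark_members:n {#1} }
\NewExpandableDocumentCommand \IfMemberTF { m m m }
  { \demo_if_member:nTF {#1} {#2} {#3} }
\ExplSyntaxOff
\begin{document}
\MarkMembers{1,2,3,4,6,9,10}

6 is \IfMemberTF{6}{in}{not in} the list.

7 is \IfMemberTF{7}{in}{not in} the list.
\end{document}

Filling the table costs one step per list entry and each later test is a single lookup, so the total work is linear. Numbers outside 1 to 1000 would be out of bounds for this array and are not handled in the sketch.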

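Answer 2

Update: Perhaps it's just me, but I cannot understand the statement that there is a cost of O(n^2). Of course, this may just be a misunderstanding, because in your question n is used for various objects. Let us call the number of elements of the big integer list M, and its largest element n_max. Then I claim that you "only" need M+n_max steps. There is no quadratic dependence on the number of entries, the length of the list, or anything similar. The following code solves the updated question: you have a list that may be large, and you want to do membership tests. This is achieved by \ProcessList{<list>}{<largest entry>}. The details of the implementation can certainly be improved (I'm sure you can add some more \expandafters and \ignorespaces and so on), but the main point is that there is no quadratic dependence at all.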

\documentclass{article}
\newcounter{iloop}
\makeatletter
\newcommand{\ProcessList}[2]{\setcounter{iloop}{0}%
\loop%
\stepcounter{iloop}%
\edef\temp{\noexpand\xdef\csname member\roman{iloop}\endcsname{0}}%
\temp%
\ifnum\number\value{iloop}<\the\numexpr#2+1\repeat%
\@for\next:=#1\do{\edef\mynum{\romannumeral\next}%
\expandafter\xdef\csname member\mynum\endcsname{1}}}
\newcommand{\IsInList}[2]{%
\edef\temp{\noexpand\xdef\noexpand#2{\csname member\romannumeral#1\endcsname}}%
\temp}
\makeatother
\begin{document}
% we assume that the list is known as well as its largest element
% they will become the arguments of \ProcessList
% (the largest element can also be found out automatically)
\ProcessList{1,2,3,4,6,9,10,14,19,21,22,25,30,33,%
35,38,39,40,42,44,49,50,59,60,62,63,64,%
66,67,70,71,80,82,83,85,88,89,94,95,96,%
97,99,103,106,107,109,112,116,117,119,121,%
123,126,128,132,133,134,138,139,140,141,%
143,147,148,150,153,155,157,163,165,168,%
170,176,177,178,180,184,186,190,197,202,%
207,208,209,219,220,224,234,235,238,239,%
242,244,247,249,251,259,262,265,267,268,%
270,275,280,283,285,287,288,289,292,300,%
301,303,307,311,313,314,315,318,319,323,%
324,325,326,327,331,337,346,352,354,356,%
361,362,363,366,367,368,369,372,375,377,%
378,382,383,384,388,391,393,394,395,398,%
399,400,402,404,405,407,408,409,412,417,%
421,423,426,434,439,440,443,445,446,448,%
456,461,466,467,468,470,472,477,478,479,%
481,482,483,485,489,493,494,496,500,502,%
505,509,512,514,518,522,527,528,530,531,%
533,535,536,541,545,548,551,553,554,556,%
557,560,562,564,565,566,570,571,572,575,%
577,587,593,600,601,604,605,607,610,611,%
613,614,619,621,622,623,625,632,633,634,%
635,636,637,639,645,648,651,656,661,665,%
666,669,674,677,678,679,680,682,683,684,%
685,687,689,690,693,698,700,703,704,708,%
710,713,714,718,719,729,730,733,737,738,%
741,744,745,746,753,760,761,762,765,770,%
772,775,780,782,783,784,789,790,792,801,%
803,804,806,809,810,814,815,818,822,823,%
824,827,829,833,836,837,838,840,841,843,%
844,847,849,853,854,855,859,864,870,871,%
873,874,876,881,882,885,887,889,890,891,%
892,893,895,900,901,903,908,910,911,913,%
915,917,919,920,922,925,927,928,931,932,%
933,934,935,936,938,942,943,945,951,956,%
959,963,964,966,971,972,974,978,989,993,%
995,997,998}{998}

test if 6 is in the list:\IsInList{6}{\mytest} \mytest

test if 7 is in the list:\IsInList{7}{\mytest} \mytest
\end{document}
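
A variation on the same idea, offered as a sketch rather than as part of the original answer: if only the listed numbers get a control sequence, the e-TeX primitive \ifcsname can test whether it exists, so the initialisation loop up to the largest entry is no longer needed. \MarkList and \IfInList are hypothetical names and the member@ prefix is arbitrary:

\documentclass{article}
\makeatletter
% \MarkList{<list>}: define one (empty) control sequence per list entry
\newcommand{\MarkList}[1]{%
  \@for\next:=#1\do{%
    % \number strips spaces around the entry before the name is built
    \expandafter\global\expandafter\let
      \csname member@\number\next\endcsname\@empty}}
% \IfInList{<integer>}{<true>}{<false>}: expandable existence test;
% \ifcsname does not define the name as a side effect
\newcommand{\IfInList}[3]{%
  \ifcsname member@\number\numexpr#1\relax\endcsname
    \expandafter\@firstoftwo
  \else
    \expandafter\@secondoftwo
  \fi{#2}{#3}}
\makeatother
\begin{document}
\MarkList{1,2,3,4,6,9,10}

6 is \IfInList{6}{in}{not in} the list.

7 is \IfInList{7}{in}{not in} the list.
\end{document}

Marking takes M steps and each test is a single hash lookup. The price is one hash table entry per marked number, which should be fine for lists of the size discussed here but is worth keeping in mind for very large ones.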

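Old answer: To select those elements that are not larger than some bound, you only need one loop over all M elements of the list. This gives you a list of K elements. At this stage, the cost is M. If you want to find out whether a given integer is in the big list, you likewise need only M steps.

In any case, here are some basic routines that do something along these lines. I strongly believe that similar routines must exist somewhere, but I could not find them.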

\documentclass{article}
\newcounter{iloop}
\newif\ifmember
\newif\iflstart
\makeatletter% for \@for see e.g. https://tex.stackexchange.com/a/100684/121799
\newcommand{\MemberQ}[2]{\global\memberfalse%
\@for\next:=#1\do{\ifnum\next=#2\global\membertrue\fi}}
\newcommand{\Preselect}[3]{\edef\itest{\the\numexpr#2+1}%
\lstarttrue%
\@for\next:=#1\do{\ifnum\next<\itest%
\iflstart%
\xdef#3{\next}%
\global\lstartfalse%
\else%
\xdef#3{#3,\next}%
\fi%
\fi}}
\newcommand{\Hits}[3]{\edef#3{-1}%
\lstarttrue%
\setcounter{iloop}{-1}\loop%
\stepcounter{iloop}%
\MemberQ{{#1}}{\number\value{iloop}}%
\ifmember%
\iflstart%
\xdef#3{\number\value{iloop}}%
\global\lstartfalse%
\else%
\xdef#3{#3,\number\value{iloop}}%
\fi\fi%
\ifnum\number\value{iloop}<#2\repeat}
\makeatother
\begin{document}
\subsection*{Tests of MemberQ}
\MemberQ{1,2,3,4}{2}
\ifmember 2 is in list \fi

\MemberQ{1,2,3,4}{5}
\ifmember 5 is in list \fi

\subsection*{Select all members of list which are smaller than or equal to a certain number}
% random list generated by Mathematica
\edef\LstLong{638, 761, 899, 899, 315, 827, 954, 696, 102, 577, 
525, 279, 108, 983, 845, 530, 658, 896, 818, 342, 
515, 946, 62, 632, 495, 784, 218, 583, 624, 761, 
230, 176, 38, 801, 514, 643, 720, 991, 930, 219, 
115, 585, 527, 115, 837, 50, 955, 566, 579, 600, 
184, 987, 212, 941, 966, 63, 192, 973, 801, 322, 
571, 946, 786, 433, 586, 997, 903, 820, 672, 618, 
355, 338, 183, 384, 479, 341, 507, 849, 431, 292, 
470, 927, 93, 460, 518, 865, 257, 712, 351, 732, 
817, 839, 217, 951, 194, 222, 604, 292, 208, 220, 
197, 476, 973, 232, 250, 527, 972, 496, 751, 824, 
334, 342, 751, 484, 883, 526, 644, 424, 368, 410, 
530, 243, 600, 216, 661, 273, 412, 685, 724, 12, 
556, 587, 380, 43, 792, 827, 687, 568, 275, 608, 
893, 863, 825, 741, 831, 406, 855, 83, 279, 290, 
341, 7, 381, 256, 437, 292, 945, 474, 326, 970, 820, 
44, 539, 903, 640, 592, 285, 512, 594, 788, 677, 
197, 787, 927, 400, 239, 220, 342, 14, 902, 677, 
858, 481, 824, 925, 639, 677, 903, 287, 223, 271, 
997, 774, 602, 293, 766, 10, 416, 638, 311, 186, 
729, 613, 31, 930, 219, 357, 887, 88, 579, 985, 446, 
334, 910, 447, 321, 183, 862, 297, 641, 139, 980, 
199, 687, 374, 322, 22, 319, 991, 672, 788, 262, 
828, 389, 684, 178, 958, 492, 597, 803, 259, 386, 
800, 86, 936, 712, 494, 447, 254, 932, 78, 789, 121, 
897, 120, 819, 935, 307, 246, 96, 16, 639, 549, 85, 
867, 509, 960, 690, 301, 348, 440, 792, 117, 157, 
567, 184, 912, 244, 686, 843, 112, 927, 328, 801, 
178, 720, 385, 380, 399, 377, 287, 76, 574, 291, 
731, 430, 670, 466, 758, 104, 825, 23, 502, 821, 
979, 753, 28, 970, 855, 958, 20, 999, 184, 598, 668, 
877, 736, 174, 850, 715, 131, 289, 786, 55, 36, 785, 
129, 851, 411, 677, 493, 913, 405, 630, 695, 582, 
555, 806, 65, 775, 448, 774, 905, 925, 353, 356, 
106, 884, 178, 176, 182, 114, 258, 112, 924, 923, 
853, 959, 300, 652, 729, 141, 14, 493, 94, 281, 668, 
173, 834, 855, 839, 665, 361, 168, 808, 34, 179, 
736, 139, 396, 963, 946, 760, 458, 390, 70, 698, 
846, 979, 597, 410, 194, 888, 97, 852, 770, 572, 
623, 453, 323, 941, 876, 99, 5, 129, 868, 552, 146, 
231, 949, 268, 755, 608, 705, 504, 635, 392, 970, 
654, 785, 295, 761, 684, 146, 482, 162, 541, 818, 
622, 828, 724, 232, 568, 807, 569, 580, 864, 709, 
217, 594, 687, 167, 248, 447, 27, 339, 341, 921, 
508, 923, 962, 430, 240, 62, 688, 212, 176, 478, 
664, 871, 219, 398, 889, 577, 312, 827, 365, 33, 
677, 751, 506, 658, 848, 717, 321, 400, 180, 561, 
926, 515, 932, 839, 828, 997, 355, 42, 334, 854, 
884, 599, 93, 393, 399, 246, 825, 553, 456, 181, 
564, 64}

% selects all elements that are smaller or equal to 97
\Preselect{\LstLong}{97}{\mylist}
\mylist

\MemberQ{\mylist}{5}
5 is \ifmember\else not\space\fi in the list

\MemberQ{\mylist}{6}
6 is \ifmember\else not\space\fi in the list


% selects all elements that are smaller or equal to 50 and sorts them,
% but is this the output you want
\Hits{\mylist}{50}{\hitlist}
\hitlist
\end{document}

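Just for completeness: a membership test that is not restricted to integers. (I am sure it lacks many features, such as being expandable, but it does not require any packages and seems to be reasonably fast. If I knew what "expandable" precisely means, I might appreciate that feature more. ;-)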

\documentclass{article}
\newif\ifmember
\makeatletter% for \@for see e.g. https://tex.stackexchange.com/a/100684/121799
\newcommand{\MemberQ}[2]{\global\memberfalse%
    \edef\temp{#2}%
    \@for\next:=#1\do{\ifx\next\temp\relax\global\membertrue\fi}}
\makeatother
\begin{document}

\MemberQ{a,4,7,11}{11} \ifmember in\else out \fi

\MemberQ{a,4,7,11}{3} \ifmember in\else out \fi

\MemberQ{a,4,7,11}{A} \ifmember in\else out \fi

\MemberQ{a,4,7,11}{a} \ifmember in\else out \fi

\end{document}
