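Follow-up to my earlier question: a macro that replaces text with a random string of the same length
Thanks to @Mico's answer, we now have a Lua-based macro that replaces a UTF-8 string with random characters. One problem remains, however: when the macro is applied to marked-up text, the code treats the characters of \...{ and } (as well as \... on its own) as part of the text to be obfuscated. This is a problem because, for wireframes, it makes the random string longer than the plain text would be. Is there a way to make xyz and \textit{xyz} produce random ASCII output of the same length?
The MWE (thanks to @Mico) is as follows: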
% !TEX TS-program = lualatex
\documentclass{article}
\usepackage{luacode} % for 'luacode' environment and '\luastring' macro
\begin{luacode}
function rndstring ( inputstring )
  local outputstring, choices, mm, nn
  mm = unicode.utf8.len(inputstring) -- no. of utf8-encoded characters in input string
  -- Place candidate replacement characters in a Lua table:
  choices = {
    "0", -- (table substantially shortened here to reduce size)
  }
  -- Number of rows in 'choices' table
  nn = #choices
  -- Generate the outputstring in a 'for' loop:
  outputstring = ""
  for i = 1 , mm do
    if unicode.utf8.sub ( inputstring , i , i ) == " " then
      outputstring = outputstring .. " " -- preserve space char.
    else -- choose a new char randomly from 'choices' table
      outputstring = outputstring .. choices[ math.random ( nn ) ]
    end
  end
  return ( outputstring )
end
\end{luacode}
%% Define a LaTeX macro to invoke the Lua function
\newcommand\rndstring[1]{\directlua{tex.sprint(rndstring(\luastring{#1}))}}
\begin{document}
\ttfamily
\rndstring{This is a string.}
\rndstring{\textit{This is a String}}
%%%% These two Strings should be (but aren't) the same length
\end{document}
Answer 1
Unfortunately, people are in the habit of processing TeX input as ordinary Lua strings, and that approach always fails once TeX tokens come into play.
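To make the failure concrete, here is a minimal sketch that only measures what the Lua side receives. The macro name \bytelength is hypothetical and not part of the original MWE; it reuses \luastring from the luacode package and Lua's standard string.len. The result is wrapped in tostring because tex.sprint interprets a leading numeric argument as a catcode table number.

```latex
% !TEX TS-program = lualatex
\documentclass{article}
\usepackage{luacode} % provides \luastring
% \bytelength is a hypothetical helper: it prints the length of the Lua
% string that its argument turns into once it has been passed to Lua.
\newcommand\bytelength[1]{%
  \directlua{tex.sprint(tostring(string.len(\luastring{#1})))}}
\begin{document}
\bytelength{xyz} versus \bytelength{\textit{xyz}}
\end{document}
```

The first number is 3, while the second is considerably larger, because the backslash, the letters of the control sequence name, and both braces all arrive in Lua as ordinary characters.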
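Sadder still, LuaTeX already ships with a built-in library for working with TeX tokens. With it, the code not only becomes more compact, but telling different kinds of tokens apart also becomes very simple.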
\documentclass{article}
\usepackage{luacode}
\begin{luacode}
local function rndstring()
  local toks = token.scan_toks()
  for n, t in ipairs(toks) do
    if t.cmdname == "letter" then
      -- random number from printable ASCII range
      local r = math.random(33, 126)
      -- create new token with that character and catcode 12
      local letter = token.create(r, 12)
      -- replace old token
      toks[n] = letter
    end
  end
  token.put_next(toks)
end
local lft = lua.get_functions_table()
lft[#lft + 1] = rndstring
token.set_lua("rndstring", #lft, "global")
\end{luacode}
\begin{document}
\ttfamily
\rndstring{This is a string.}
\rndstring{\textit{This is a String}}
\end{document}
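Note that the loop above only touches tokens whose cmdname is "letter", so digits and punctuation such as the final period pass through unchanged. If those should be obfuscated as well, one possible variant is sketched below. It assumes that such characters are reported with the command name "other_char", and the macro name \rndstringall is hypothetical. Be aware that any digits or punctuation appearing inside macro arguments would be scrambled too.

```latex
% !TEX TS-program = lualatex
\documentclass{article}
\usepackage{luacode}
\begin{luacode}
-- Hypothetical variant of rndstring: besides letters (catcode 11) it also
-- replaces "other" characters (catcode 12) such as digits and punctuation.
local function rndstringall()
  local toks = token.scan_toks()
  for n, t in ipairs(toks) do
    local name = t.cmdname
    if name == "letter" or name == "other_char" then
      -- random printable ASCII character, created with catcode 12
      toks[n] = token.create(math.random(33, 126), 12)
    end
  end
  -- spaces, braces, and control sequences are put back unchanged
  token.put_next(toks)
end
local lft = lua.get_functions_table()
lft[#lft + 1] = rndstringall
token.set_lua("rndstringall", #lft, "global")
\end{luacode}
\begin{document}
\ttfamily
\rndstringall{This is a string.}

\rndstringall{\textit{This is a String.}}
\end{document}
```

Because exactly one token is substituted for each replaced token, both lines should still come out with the same number of characters.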
Answer 2
I think a pure LaTeX solution is preferable.
\documentclass{article}
\usepackage[T1]{fontenc}
\begin{document}
\ExplSyntaxOn
% specify what candidates are in the random replacement
\def\RandomStringASCIIRanges{
%33-47,
%48-57,
58-64,
65-90,
91-96,
97-122,
%123-126
}
\seq_new:N \l_chrepl_all_repl_seq
\clist_new:N \l_chrepl_tmpa_clist
\int_new:N \l_chrepl_tmpa_int
\tl_new:N \l_chrepl_tmpa_tl
\tl_new:N \g_chrepl_tmpa_tl
\tl_new:N \g_chrepl_tmpb_tl
\tl_new:N \l_chrepl_rand_charcode_tl
\tl_new:N \l_chrepl_head_tl
\cs_set:Npn \__chrepl_parse_ascii_range:w |#1-#2| {
  \int_step_inline:nnn {#1} {#2} {
    \seq_put_right:Nn \l_chrepl_all_repl_seq {##1}
  }
}
\cs_set:Npn \__chrepl_parse_ascii_range:n #1 {
  \__chrepl_parse_ascii_range:w |#1|
}
% parse the ranges
\clist_set:NV \l_chrepl_tmpa_clist \RandomStringASCIIRanges
\clist_map_function:NN \l_chrepl_tmpa_clist \__chrepl_parse_ascii_range:n
% construct an intarray for fast access
\intarray_new:Nn \g_chrepl_repl_intarray {\seq_count:N \l_chrepl_all_repl_seq}
\int_set:Nn \l_chrepl_tmpa_int {1} % loop index
\seq_map_inline:Nn \l_chrepl_all_repl_seq {
  \intarray_gset:Nnn \g_chrepl_repl_intarray {\l_chrepl_tmpa_int} {#1}
  \int_incr:N \l_chrepl_tmpa_int
}
\cs_set:Npn \__chrepl_temp_var:n #1 {
  __g_chrepl_temp_#1_tl
}
\cs_set:Npn \__chrepl_group:n #1 {
  \exp_not:n { {#1} }
}
% a recursive replacement algorithm
\cs_set:Npn \chrepl_repl:Nnn #1#2#3 {
  \group_begin:
  \tl_if_empty:nF {#2} {
    % check if head is space
    % if head is space, insert it back
    \tl_if_head_is_space:nTF {#2} {
      \tl_gput_right:Nn #1 {\ }
      % recursive call (skip spaces)
      \exp_args:Nnx \chrepl_repl:Nnn #1 {\tl_trim_spaces:n {#2}} {#3}
    } {
      \tl_if_head_is_group:nTF {#2} {
        % the results in this group need to be written to a unique temp variable
        % clear the temp var. corresponding to this level
        \tl_gclear:c {\__chrepl_temp_var:n {#3}}
        \chrepl_repl:cxx {\__chrepl_temp_var:n {#3}} {\tl_head:n {#2}} {\int_eval:n {#3 + 1}}
        \tl_set_eq:Nc \l_chrepl_tmpa_tl {\__chrepl_temp_var:n {#3}}
        \tl_gput_right:Nx #1 {
          \exp_args:NV \__chrepl_group:n \l_chrepl_tmpa_tl
        }
      } {
        % extract the head
        \tl_set:Nx \l_chrepl_head_tl {\tl_head:n {#2}}
        \tl_if_empty:NF \l_chrepl_head_tl {
          % if head is control sequence, insert it back
          \exp_args:NV \token_if_cs:NTF \l_chrepl_head_tl {
            % debugging aid: shows the control sequence in the terminal and log
            \tl_show:N \l_chrepl_head_tl
            \tl_gput_right:NV #1 \l_chrepl_head_tl
          } {
            % otherwise, do replacement
            % randomly pick a charcode from the intarray
            \tl_set:Nx \l_chrepl_rand_charcode_tl {\intarray_rand_item:N \g_chrepl_repl_intarray}
            % generate the corresponding character
            \tl_gput_right:Nx #1 {\char_generate:nn {\l_chrepl_rand_charcode_tl} {12}}
          }
        }
      }
      % recursive call
      \exp_args:Nnx \chrepl_repl:Nnn #1 {\tl_tail:n {#2}} {#3}
    }
  }
  \group_end:
}
\cs_generate_variant:Nn \chrepl_repl:Nnn {cxx}
% user function
\newcommand{\rndstr}[1]{
  \tl_gclear:N \g_chrepl_tmpa_tl % used to store results
  \chrepl_repl:Nnn \g_chrepl_tmpa_tl {#1} {1}
  % debugging aid: shows the assembled result in the terminal and log
  \tl_show:N \g_chrepl_tmpa_tl
  \tl_use:N \g_chrepl_tmpa_tl
}
\ExplSyntaxOff
\texttt{\rndstr{Hello World}}
\texttt{\rndstr{Hello Владимир öäüß}}
\texttt{\rndstr{this \textsl{ab{\huge\bfseries cdef}gh}} nested groups.}
\texttt{\rndstr{this {ab{cdef}gh}} nested groups.}
\texttt{\rndstr{this abcdefgh nested groups.}}
\texttt{\rndstr{this \{abcdefgh\} nested groups.}}
\texttt{\rndstr{Once upon a time, there was ...}}
\end{document}