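Based on this question, I am trying to find duplicates in the list of generated hashes. Duplicates can occur because I accidentally used the same X & Y values, or because the first three digits of two hashes are equal. In that case I would like to raise a compiler error such as \errmessage{Hash \tempHash already used!}.
Everything inside the \calcHash function seems very fragile, and I don't know how to add functionality without breaking the hash calculation.
In the following MWE I tried to find duplicates via a datatool database. That is not mandatory, though: if there is another simple way to search for duplicates, I'm totally fine with that :)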
\documentclass[10pt,a4paper]{article}
\usepackage{pgfplotstable}
\usepackage{xstring}
\usepackage{datatool}
\pgfplotsset{compat=newest}
\DTLnewdb{hashDB}
\newcommand{\calcHash}[1]{
\noexpand\StrLeft{\pdfmdfivesum{#1}}{3}
\newcommand{\tempHash}{\StrLeft{\pdfmdfivesum{#1}}{3}}
%\DTLifdbempty{hashDB}
%{
%\DTLnewrow{hashDB}
%\DTLnewdbentry{hashDB}{Hash}{\tempHash}
%}
%\DTLforeach{hashDB}{\hash=hash}
%{
%\ifthenelse{\equal{\tempHash}{\hash}}
%{
% \errmessage{Hash \tempHash already used!}
%}{}
%}
}
\pgfplotstableread[]{
X Y
1 a
2 b
5 c
}\mydata
\begin{document}
\pgfplotstablecreatecol[
create col/assign/.code={
\edef\myHash{\noexpand\calcHash{\thisrow{X}\thisrow{Y}}}
\pgfkeyslet{/pgfplots/table/create col/next content}\myHash
}]{ID}{\mydata}
\pgfplotstablegetrowsof{\mydata}
\pgfmathtruncatemacro\myDataRows{\pgfplotsretval-1}
\pgfplotstabletypeset[string type]{\mydata}
\end{document}
Answer 1
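The following uses an L3 sequence and the L3 MD5 hash function (\str_mdfive_hash:e) to implement \calcHash. Note that \calcHash is used in place: it is not stored in another macro that is then assigned to next content. Instead, \calcHash itself sets next content.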
\documentclass[10pt,a4paper]{article}
\usepackage{pgfplotstable}
\pgfplotsset{compat=newest}
\ExplSyntaxOn
\str_new:N \l__pascals_hash_str
\seq_new:N \g__pascals_hashes_seq
\msg_new:nnn { pascals } { duplicate-hash }
{ Hash~ #1~ already~ used! }
\cs_generate_variant:Nn \str_set:Nn { Ne }
\cs_new_protected:Npn \__pascals_calc_hash:n #1
{
\str_set:Ne \l__pascals_hash_str { \str_mdfive_hash:e {#1} }
\seq_if_in:NVTF \g__pascals_hashes_seq \l__pascals_hash_str
{ \msg_error:nnV { pascals } { duplicate-hash } \l__pascals_hash_str }
{ \seq_gput_right:NV \g__pascals_hashes_seq \l__pascals_hash_str }
\pgfkeyslet { /pgfplots/table/create~ col/next~ content } \l__pascals_hash_str
}
\NewDocumentCommand \clearHashes {} { \seq_gclear:N \g__pascals_hashes_seq }
\NewDocumentCommand \calcHash { m } { \__pascals_calc_hash:n {#1} }
\ExplSyntaxOff
\pgfplotstableread[]{
X Y
1 a
2 b
5 c
}\mydata
\begin{document}
\clearHashes
\pgfplotstablecreatecol[
create col/assign/.code={%
\calcHash{\thisrow{X}\thisrow{Y}}%
}]{ID}{\mydata}
\pgfplotstablegetrowsof{\mydata}
\pgfmathtruncatemacro\myDataRows{\pgfplotsretval-1}
\pgfplotstabletypeset[string type]{\mydata}
\end{document}
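The duplicate check itself does not depend on pgfplotstable. As a minimal sketch of the same technique in isolation (the names \g__demo_seen_seq and \demoRegister are hypothetical and exist only for this illustration), a global sequence records every value seen so far, and a repeated value raises an error:

```latex
\documentclass{article}
\ExplSyntaxOn
% Hypothetical names, used only in this sketch.
\seq_new:N \g__demo_seen_seq
\msg_new:nnn { demo } { duplicate }
  { Value~ '#1'~ already~ registered! }
\NewDocumentCommand \demoRegister { m }
  {
    % Raise an error if the value is already stored, otherwise store it.
    \seq_if_in:NnTF \g__demo_seen_seq {#1}
      { \msg_error:nnn { demo } { duplicate } {#1} }
      { \seq_gput_right:Nn \g__demo_seen_seq {#1} }
  }
\ExplSyntaxOff
\begin{document}
\demoRegister{abc}% first use: stored silently
\demoRegister{def}% different value: stored silently
\demoRegister{abc}% repeated value: raises the "duplicate" error
\end{document}
```

The answer's code follows the same pattern, with the stored value being the MD5 hash of the row contents. Because the sequence is global, \clearHashes is needed before a second table is processed if hashes may legitimately repeat across tables.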
A variant that uses only the first three characters of the resulting hash:
\documentclass[10pt,a4paper]{article}
\usepackage{pgfplotstable}
\pgfplotsset{compat=newest}
\ExplSyntaxOn
\str_new:N \l__pascals_hash_str
\seq_new:N \g__pascals_hashes_seq
\msg_new:nnn { pascals } { duplicate-hash }
{ Hash~ #1~ already~ used! }
\cs_generate_variant:Nn \str_set:Nn { Ne }
\cs_generate_variant:Nn \str_range:nnn { e }
\cs_new_protected:Npn \__pascals_calc_hash:n #1
{
\str_set:Ne \l__pascals_hash_str
{ \str_range:enn { \str_mdfive_hash:e {#1} } { 1 } { 3 } }
\seq_if_in:NVTF \g__pascals_hashes_seq \l__pascals_hash_str
{ \msg_error:nnV { pascals } { duplicate-hash } \l__pascals_hash_str }
{ \seq_gput_right:NV \g__pascals_hashes_seq \l__pascals_hash_str }
\pgfkeyslet { /pgfplots/table/create~ col/next~ content } \l__pascals_hash_str
}
\NewDocumentCommand \clearHashes {} { \seq_gclear:N \g__pascals_hashes_seq }
\NewDocumentCommand \calcHash { m } { \__pascals_calc_hash:n {#1} }
\ExplSyntaxOff
\pgfplotstableread[]{
X Y
1 a
2 b
5 c
}\mydata
\begin{document}
\clearHashes
\pgfplotstablecreatecol[
create col/assign/.code={%
\calcHash{\thisrow{X}\thisrow{Y}}%
}]{ID}{\mydata}
\pgfplotstablegetrowsof{\mydata}
\pgfmathtruncatemacro\myDataRows{\pgfplotsretval-1}
\pgfplotstabletypeset[string type]{\mydata}
\end{document}
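Truncating to three hexadecimal digits leaves only $N = 16^3 = 4096$ distinct IDs, so collisions between different rows become likely fairly quickly. As a rough estimate, assuming the truncated hashes are uniformly distributed, the probability that at least two of $n$ distinct rows share an ID follows the usual birthday bound:

```latex
\[
  P_{\mathrm{collision}}(n)
    = 1 - \prod_{k=0}^{n-1} \left( 1 - \frac{k}{N} \right)
    \approx 1 - \exp\!\left( -\frac{n(n-1)}{2N} \right),
  \qquad N = 16^{3} = 4096 .
\]
```

Under this approximation the probability reaches about one half at roughly $n = 76$ rows, which is why the duplicate check matters more for the truncated variant than for the full hash.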
And yet another variant, which uses the full hash by default but takes an optional argument to use only the first n characters:
\documentclass[10pt,a4paper]{article}
\usepackage{pgfplotstable}
\pgfplotsset{compat=newest}
\ExplSyntaxOn
\str_new:N \l__pascals_hash_str
\seq_new:N \g__pascals_hashes_seq
\msg_new:nnn { pascals } { duplicate-hash }
{ Hash~ #1~ already~ used! }
\cs_generate_variant:Nn \str_set:Nn { Ne }
\cs_generate_variant:Nn \str_range:nnn { e }
\cs_new_protected:Npn \__pascals_calc_hash:nn #1#2
{
\str_set:Ne \l__pascals_hash_str
{ \str_range:enn { \str_mdfive_hash:e {#1} } { 1 } {#2} }
\seq_if_in:NVTF \g__pascals_hashes_seq \l__pascals_hash_str
{ \msg_error:nnV { pascals } { duplicate-hash } \l__pascals_hash_str }
{ \seq_gput_right:NV \g__pascals_hashes_seq \l__pascals_hash_str }
\pgfkeyslet { /pgfplots/table/create~ col/next~ content } \l__pascals_hash_str
}
\NewDocumentCommand \clearHashes {} { \seq_gclear:N \g__pascals_hashes_seq }
\NewDocumentCommand \calcHash { O{-1} m } { \__pascals_calc_hash:nn {#2} {#1} }
\ExplSyntaxOff
\pgfplotstableread[]{
X Y
1 a
2 b
5 c
}\mydata
\begin{document}
\clearHashes
\pgfplotstablecreatecol[
create col/assign/.code={%
\calcHash[3]{\thisrow{X}\thisrow{Y}}%
}]{ID}{\mydata}
\pgfplotstablegetrowsof{\mydata}
\pgfmathtruncatemacro\myDataRows{\pgfplotsretval-1}
\pgfplotstabletypeset[string type]{\mydata}
\end{document}
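The default O{-1} yields the full hash because \str_range:nnn counts negative indices from the end of the string, so the range from 1 to -1 covers every character. A small sketch of that behaviour on its own (\demoRange is a hypothetical helper, not part of the answer):

```latex
\documentclass{article}
\ExplSyntaxOn
% Hypothetical helper, used only in this sketch.
\NewDocumentCommand \demoRange { O{-1} m }
  {
    % Characters 1 to #1 of #2; an end index of -1 means "last character".
    \str_range:nnn {#2} { 1 } {#1}
  }
\ExplSyntaxOff
\begin{document}
\demoRange{0123456789abcdef}\par    % no optional argument: whole string
\demoRange[3]{0123456789abcdef}\par % first three characters only
\end{document}
```

In the answer's code the same mechanism means \calcHash{...} stores the complete hash, while \calcHash[3]{...} reproduces the three-character variant.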