我必须从 CSV 数据生成多个表格,这些表格长达几百页甚至更多。因此,我需要某种自动化功能来实现这一点。
请注意以下几点:
- 每行数据有两个主要部分,第一列是某种名称,其余列是数据。
- CSV 数据中的列数因表而异。但单个表中的列数是固定的。
- 数据是在软件应用程序中自动生成的。表中的行数是可变的,并且可能跨越多页。因此,我需要使用
longtable
。 - 主类别下有可变数量的列。也就是说,每个类别下的子类别数量是不固定的。
- 由于生成这些数据的应用程序也是本地的(自建的),因此数据格式可以定制(在一定程度上)。
- 表格中的行数因表格而异,并且可能(垂直)超过一页。
- 当表格水平分割时,必须重复一些最左边的列。
CSV 文件看起来将会像这样。
Name,4:Category 1,3:Category 2,6:Category 3,8:Category 4,5:Category 5,7:Category 6 Sub-category,Sc11,Sc12,Sc13,Sc14,Sc21,Sc22,Sc23,Sc31,Sc32,Sc33,Sc34,Sc35,Sc36,Sc41,Sc42,Sc43,Sc44,Sc45,Sc46,Sc47,Sc48,Sc51,Sc52,Sc53,Sc54,Sc55,Sc61,Sc62,Sc63,Sc64,Sc65,Sc66,Sc67 Name 1,11,12,13,14,21,22,23,31,32,33,34,35,36,41,42,43,44,45,46,47,48,51,52,53,54,55,61,62,63,64,65,66,67 Name 2,11,12,13,14,21,22,23,31,32,33,34,35,36,41,42,43,44,45,46,47,48,51,52,53,54,55,61,62,63,64,65,66,67
每个数字,米前面的Category n
表示该类别下的子类别数。例如,4:Category 1
表示 下有 4 个子类别Category 1
。
虽然上面的内容或多或少是纯的CSV 输入,在每一行周围放置某种预定义的宏命令,以及在整个表周围放置环境也是可能的。因此,一个表的输入可能如下所示:
\begin{csvwidetable}{1}
\csvwidetableheader{Name,4:Category 1,3:Category 2,6:Category 3,8:Category 4,5:Category 5,7:Category 6}
\csvwidetablescheader{Sub-category,Sc11,Sc12,Sc13,Sc14,Sc21,Sc22,Sc23,Sc31,Sc32,Sc33,Sc34,Sc35,Sc36,Sc41,Sc42,Sc43,Sc44,Sc45,Sc46,Sc47,Sc48,Sc51,Sc52,Sc53,Sc54,Sc55,Sc61,Sc62,Sc63,Sc64,Sc65,Sc66,Sc67}
\csvwidetablerow{Name 1,11,12,13,14,21,22,23,31,32,33,34,35,36,41,42,43,44,45,46,47,48,51,52,53,54,55,61,62,63,64,65,66,67}
\csvwidetablerow{Name 2,11,12,13,14,21,22,23,31,32,33,34,35,36,41,42,43,44,45,46,47,48,51,52,53,54,55,61,62,63,64,65,66,67}
\end{csvwidetable}
这里的1
环境参数表明最左边的 1 列必须重复。
请参阅下面的代码,我相信它应该能够解释我想要实现的效果。
\begin{document}
\newlength{\ncw}
\setlength{\ncw}{10.00mm}
If we resort to putting each row of CSV in a single row of the table,
it will not fit. Moreover, we will need to find a way to put those
headers like `Category 1` over multiple columns.
\begin{longtable}[l]{|l|*{33}{C{\ncw}|}}
\hline
Name&\multicolumn{4}{|c|}{Category 1}&\multicolumn{3}{c|}{Category 2}&\multicolumn{6}{c|}{Category 3}&\multicolumn{8}{|c|}{Category 4}&\multicolumn{5}{c|}{Category 5}&\multicolumn{7}{c|}{Category 6}\\\hline
Sub-category&Sc11&Sc12&Sc13&Sc14&Sc21&Sc22&Sc23&Sc31&Sc32&Sc33&Sc34&Sc35&Sc36&Sc41&Sc42&Sc43&Sc44&Sc45&Sc46&Sc47&Sc48&Sc51&Sc52&Sc53&Sc54&Sc55&Sc61&Sc62&Sc63&Sc64&Sc65&Sc66&Sc67\\\hline
\endhead
\hline
\endfoot
Name 1&11&12&13&14&21&22&23&31&32&33&34&35&36&41&42&43&44&45&46&47&48&51&52&53&54&55&61&62&63&64&65&66&67\\\hline
Name 2&11&12&13&14&21&22&23&31&32&33&34&35&36&41&42&43&44&45&46&47&48&51&52&53&54&55&61&62&63&64&65&66&67\\
\end{longtable}
As of this moment, we employ some pre-fixed heuristics in the
application to break columns at convenient points. We need to
repeat the left column. See below for examples.
In this table, I can accommodate up to Category 3.
\begin{longtable}[l]{|l|*{13}{C{\ncw}|}}
\hline
Name&\multicolumn{4}{|c|}{Category 1}&\multicolumn{3}{c|}{Category 2}&\multicolumn{6}{c|}{Category 3}\\\hline
Sub-category&Sc11&Sc12&Sc13&Sc14&Sc21&Sc22&Sc23&Sc31&Sc32&Sc33&Sc34&Sc35&Sc36\\\hline
\endhead
\hline
\endfoot
Name 1&11&12&13&14&21&22&23&31&32&33&34&35&36\\\hline
Name 2&11&12&13&14&21&22&23&31&32&33&34&35&36\\
\end{longtable}
The rest (Category 4--Category 6) goes in this table.
\begin{longtable}[l]{|l|*{20}{C{\ncw}|}}
\hline
Name&\multicolumn{8}{|c|}{Category 4}&\multicolumn{5}{c|}{Category 5}&\multicolumn{7}{c|}{Category 6}\\\hline
Sub-category&Sc41&Sc42&Sc43&Sc44&Sc45&Sc46&Sc47&Sc48&Sc51&Sc52&Sc53&Sc54&Sc55&Sc61&Sc62&Sc63&Sc64&Sc65&Sc66&Sc67\\\hline
\endhead
\hline
\endfoot
Name 1&41&42&43&44&45&46&47&48&51&52&53&54&55&61&62&63&64&65&66&67\\\hline
Name 2&41&42&43&44&45&46&47&48&51&52&53&54&55&61&62&63&64&65&66&67\\
\end{longtable}
\end{document}
这是上述代码的输出。
因此,我们试图实现的是从 CSV 数据生成长表,当宽度超过文本宽度时在方便的列处分列(而不在任何列的中间分列Category n
),并且(可选)重复一些最左边的列。
(请忽略垂直线的问题。我知道表格中的垂直线在这个论坛上并没有引起太大的反响。)
答案1
这是一个可能的解决方案。它使用datatooltk
从 CSV 文件中分离出第一行。datatooltk
在 LaTeX 之前运行或使用 运行 LaTeX -shell-escape
。(选项 需要 1.8 版本--csv-skiplines
。)
假设:
- 第一列(
Name
)没有子类别,并且始终出现在每个表块的开头。 - 行数未超出页面高度。(如果出现这种情况,请替换
tabular
为longtable
。)
梅威瑟:
\documentclass{article}
\usepackage[a4paper]{geometry}
\usepackage{datatool}
\immediate\write18{datatooltk --name datablocks --csv testdata.csv --nocsv-header --truncate 1 -o datablocks.dbtex}
\immediate\write18{datatooltk --name data --csv testdata.csv --csv-header --csv-skiplines 1 -o data.dbtex}
\DTLloaddbtex{\datablocks}{datablocks.dbtex}
\DTLloaddbtex{\data}{data.dbtex}
% split <n>:header
\def\parseblockheader#1:#2\endparseblock{%
\def\blockspan{#1}%
\def\blockheader{#2}%
}
\newcount\blockidx
\newcount\colidx
% First column is the name, which is a special case.
\blockidx=1
\DTLgetvalue{\currentvalue}{\datablocks}{1}{\blockidx}
\cslet{blockheader\number\blockidx}{\currentvalue}
\csdef{blockrange\number\blockidx}{1}
\csdef{blockspan\number\blockidx}{1}
\colidx=1
\loop
\advance\blockidx by 1\relax
\ifnum\blockidx>\DTLcolumncount\datablocks
\else
\DTLgetvalue{\currentvalue}{\datablocks}{1}{\blockidx}%
\ifdefempty\currentvalue
{% empty columns caused by discrepancy between column count on
% line 1 of CSV being less than column count of remaining lines
\edef\totalblocks{\number\numexpr\blockidx-1}%
% break loop
\blockidx=\DTLcolumncount{\datablocks}%
}%
{%
\expandafter\parseblockheader\currentvalue\endparseblock
\cslet{blockheader\number\blockidx}{\blockheader}%
\cslet{blockspan\number\blockidx}{\blockspan}%
\edef\endrange{\number\numexpr\colidx+\blockspan}%
\global\advance\colidx by 1\relax
\csedef{blockrange\number\blockidx}{\number\numexpr\colidx}%
{%
\loop
\global\advance\colidx by 1
\csxappto{blockrange\number\blockidx}{,\number\numexpr\colidx}%
\ifnum\colidx<\endrange
\repeat
}%
}%
\fi
\ifnum\blockidx<\DTLcolumncount{\datablocks}
\repeat
\makeatletter
% iterate over columns in given block
\newcommand*{\forblock}[2]{%
\letcs{\rangelist}{blockrange\number#1}%
\@for\thiscol:=\rangelist\do{#2}%
}
% iterate over columns in given table
\newcommand*{\fortableblock}[2]{%
\letcs{\blockrangelist}{blocktablecolspan\number#1}%
\@for\thiscol:=\blockrangelist\do{#2}%
}
% iterate over blocks in given table
\newcommand*{\fortableblockrange}[2]{%
\letcs{\blockrangelist}{blocktablerange\number#1}%
\@for\thisblock:=\blockrangelist\do{#2}%
}
\makeatother
% find maximum widths (including \tabcolsep)
\newcommand*{\defandsetlength}[2]{%
\global\newlength{#1}%
\global\setlength{#1}{#2}%
}
\newcount\rowidx
% in case header needs to use a different font:
\newcommand{\headerfont}[1]{#1}
\newcommand{\computeblockwidths}{%
% compute header widths
\dtlforeachkey(\thiskey,\thiscol,\thistype,\thisheader)\in\data\do
{%
\settowidth{\dimen0}{\headerfont\thisheader}%
\dimen0=\dimexpr\dimen0+2\tabcolsep\relax
\expandafter\defandsetlength
\csname columnwidth\number\thiscol\endcsname{\dimen0}%
% save header
\cslet{columnheader\number\thiscol}{\thisheader}%
}%
\loop
\advance\rowidx by 1\relax
{%
\blockidx = 0\relax
\loop
\advance\blockidx by 1\relax
\forblock{\blockidx}{%
\DTLgetvalue{\currentvalue}{\data}{\rowidx}{\thiscol}%
\settowidth{\dimen0}{\currentvalue}%
\dimen0=\dimexpr\dimen0+2\tabcolsep\relax
\ifcsdef{columnwidth\number\thiscol}
{%
\ifnum\dimen0>\csname columnwidth\number\thiscol\endcsname
\csname columnwidth\number\thiscol\endcsname=\dimen0
\fi
}%
{%
\expandafter\defandsetlength
\csname columnwidth\number\thiscol\endcsname{\dimen0}%
}%
}%
\ifnum\blockidx<\totalblocks
\repeat
}%
\ifnum\rowidx<\DTLrowcount{\data}
\repeat
% compute block widths
\blockidx=0
\loop
\advance\blockidx by 1\relax
\expandafter\newlength\csname blockwidth\number\blockidx\endcsname
\forblock{\blockidx}{%
\advance\csname blockwidth\number\blockidx\endcsname by
\csname columnwidth\number\thiscol\endcsname
}%
% check if block headers are wider
\settowidth{\dimen0}%
{\headerfont{\csname blockheader\number\blockidx\endcsname}}%
\dimen0=\dimexpr\dimen0+2\tabcolsep\relax
\ifdim\dimen0>\csname blockwidth\number\blockidx\endcsname
\csname blockwidth\number\blockidx\endcsname=\dimen0\relax
\fi
\ifnum\blockidx<\totalblocks
\repeat
}
% create table code
\newlength\currentwidth
\newcount\currenttable
\newcount\maxtables
\newcommand{\createtable}[1]{%
\csgdef{blocktablecolspec\number#1}{l}%
\csxdef{blocktablecolspan\number#1}{\csuse{blockrange1}}%
\csxdef{blocktablerange\number#1}{1}%
}
\newcommand{\dotable}{%
\computeblockwidths
% first column always present
\global\currentwidth=\csname blockwidth1\endcsname
\global\currenttable=1\relax
\createtable{1}%
% loop over remaining blocks
\blockidx=1
\loop
\advance\blockidx by 1\relax
\global\currentwidth=\dimexpr\currentwidth
+\csname blockwidth\number\blockidx\endcsname\relax
\relax
\ifdim\currentwidth>\linewidth
\global\currentwidth=\dimexpr\csname blockwidth1\endcsname
+\csname blockwidth\number\blockidx\endcsname\relax
\global\advance\currenttable by 1\relax
\createtable\currenttable
\fi
\csgappto{blocktablecolspec\number\currenttable}{|}%
{%
\colidx=0
\loop
\advance\colidx by 1\relax
\csgappto{blocktablecolspec\number\currenttable}{l}%
\ifnum\colidx<\csname blockspan\number\blockidx\endcsname
\repeat
}%
\csxappto{blocktablecolspan\number\currenttable}%
{,\csuse{blockrange\number\blockidx}}%
\csxappto{blocktablerange\number\currenttable}{,\number\blockidx}%
\ifnum\blockidx<\totalblocks
\repeat
% save table count
\maxtables=\currenttable
% create table code
\currenttable=0\relax
\loop
\advance\currenttable by 1\relax
% tabular setup
\def\currenttablecode{\begin{tabular}}%
\eappto\currenttablecode{%
{|\csname blocktablecolspec\number\currenttable\endcsname|}%
\noexpand\hline}%
% block headers
\fortableblockrange{\currenttable}{%
\ifnum\thisblock>1
\appto\currenttablecode{&}%
\fi
\eappto\currenttablecode{%
\noexpand\multicolumn
{\csname blockspan\number\thisblock\endcsname}
{|c|}%
{%
\noexpand\headerfont{%
\expandonce{\csname blockheader\number\thisblock\endcsname}}%
}%
}%
}%
% column sub-headers
\appto\currenttablecode{\\}%
\fortableblock{\currenttable}{%
% get header
\letcs{\thisheader}{columnheader\number\thiscol}%
\ifnum\thiscol>1
\appto\currenttablecode{&}%
\fi
\eappto\currenttablecode{\noexpand\headerfont{\expandonce\thisheader}}%
}%
% table body
{%
\rowidx=0
\loop
\advance\rowidx by 1\relax
\gappto\currenttablecode{\\\hline}%
\fortableblock{\currenttable}{%
\DTLgetvalue{\currentvalue}{\data}{\rowidx}{\thiscol}%
\ifnum\thiscol>1
\gappto\currenttablecode{&}%
\fi
\xappto\currenttablecode{\expandonce\currentvalue}%
}%
\ifnum\rowidx<\DTLrowcount{\data}
\repeat
}%
% tabular end
\appto\currenttablecode{\\\hline\end{tabular}}%
% do table
\bigskip\par\noindent\currenttablecode
\ifnum\currenttable<\maxtables
\repeat
}
\begin{document}
\dotable
\end{document}
结果:
答案2
这是一个部分解决方案,因为它不能自动构建标题,但我认为它可能很有用,因为至少您不必重新输入数据。
如果您设法在 .csv 的第一行之前 放置一个#
或一个,您可以尝试以下操作:%
\documentclass{article}
\usepackage{longtable}
\usepackage{array}
\newcolumntype{C}[1]{>{\centering\arraybackslash\hspace{0pt}}p{#1}}
\usepackage[legalpaper,landscape,left=25.0mm,right=25.0mm]{geometry}
\parindent 0.0mm
\usepackage{pgfplotstable}
\pgfplotsset{compat=1.14}
\usepackage{filecontents}
% The following code lines are added only to create myfile.csv. Of course, you don't need them in your code
\begin{filecontents*}{myfile.csv}
#Name,4:Category 1,3:Category 2,6:Category 3,8:Category 4,5:Category 5,7:Category 6
Sub-category,Sc11,Sc12,Sc13,Sc14,Sc21,Sc22,Sc23,Sc31,Sc32,Sc33,Sc34,Sc35,Sc36,Sc41,Sc42,Sc43,Sc44,Sc45,Sc46,Sc47,Sc48,Sc51,Sc52,Sc53,Sc54,Sc55,Sc61,Sc62,Sc63,Sc64,Sc65,Sc66,Sc67
Name 1,11,12,13,14,21,22,23,31,32,33,34,35,36,41,42,43,44,45,46,47,48,51,52,53,54,55,61,62,63,64,65,66,67
Name 2,11,12,13,14,21,22,23,31,32,33,34,35,36,41,42,43,44,45,46,47,48,51,52,53,54,55,61,62,63,64,65,66,67
\end{filecontents*}
% end code to create myfile.csv
\begin{document}
\newlength{\ncw}
\setlength{\ncw}{10.00mm}
\pgfplotstableread[col sep=comma]{myfile.csv}{\mytable}
\pgfplotstabletypeset[
begin table={\begin{longtable}},
begin table/.add={}{[l]},
every head row/.append style={before row={%
\hline
Name&\multicolumn{4}{c|}{Category 1}&\multicolumn{3}{c|}{Category 2}&\multicolumn{6}{c|}{Category 3}\\
\hline
\endhead
},
},
after row=\hline,
end table={\end{longtable}},
every first column/.style={column type={|l|}},
every column/.style={column type={C{\ncw}|}},
columns={[index]0,[index]1,[index]2,[index]3,[index]4,[index]5,
[index]6,[index]7,[index]8,[index]9,[index]10,[index]11,
[index]12,[index]13},
string type
]{\mytable}
\pgfplotstabletypeset[
begin table={\begin{longtable}},
begin table/.add={}{[l]},
every head row/.append style={before row={%
\hline
Name&\multicolumn{8}{|c|}{Category 4}&\multicolumn{5}{c|}{Category 5}&\multicolumn{7}{c|}{Category 6}\\
\hline
\endhead
},
},
after row=\hline,
end table={\end{longtable}},
every first column/.style={column type={|l|}},
every column/.style={column type={C{\ncw}|}},
columns={[index]0,[index]14,[index]15,[index]16,[index]17,[index]18,
[index]19,[index]20,[index]21,[index]22,[index]23,[index]24,
[index]25,[index]26,[index]27,[index]28,[index]29,[index]30,[index]31,
[index]32,[index]33},
string type
]{\mytable}
\end{document}