How can I merge data and create multiple documents, similar to a mail merge?

Question

Since the most powerful tool in my toolbox is Python, I see Python problems everywhere. ("If my only tool is a hammer, every problem looks like a nail.")

The idea is as follows:

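A data file, which can be Excel, CSV, a MySQL database, or whatever you like.
A LaTeX template with placeholders. I used @@key because two consecutive @ characters should not otherwise appear in the document, but you can pick your own marker.
A Python script that fills in the placeholders for each row of the data and calls LaTeX to produce the result.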
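
The placeholder step on its own can be sketched on plain strings, without pandas or LaTeX. This is only an illustration under stated assumptions: the `fill` helper, the `PLACEHOLDER` pattern, and the sample row are hypothetical names, not part of the script below. It uses a regular expression rather than `str.replace`, so that a key which is a prefix of another key (say `date` and `dateOfBirth`) is not replaced by mistake.

```python
import re

# matches @@ followed by an identifier-like key, e.g. @@firstname
PLACEHOLDER = re.compile(r'@@(\w+)')


def fill(template, row):
    """Replace every @@key in template with str(row[key]).

    Unknown keys are left untouched so they are easy to spot in the output.
    """
    def substitute(match):
        key = match.group(1)
        if key in row:
            return str(row[key])
        return match.group(0)

    return PLACEHOLDER.sub(substitute, template)


# hypothetical sample row, mirroring the columns used in this post
row = {'productID': 1, 'firstname': 'Jules', 'lastname': 'Winnfield'}
print(fill('@@productID: @@firstname @@lastname (@@unknown)', row))
```

Because `row` only needs to support `key in row` and `row[key]`, a plain dict works here, and a pandas row should work the same way.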

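Apart from the standard library, you only need pandas for the data handling part. If you want to learn Python next and have a scientific background (which I assume you do if you use Mathematica), I recommend installing Python via the Anaconda distribution. It bundles almost all scientific modules together with prebuilt dependencies: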

http://continuum.io/downloads#py34

This is my data.csv:

productID,firstname,lastname,date
1,Jules,Winnfield,2015-01-01
2,Vincent,Vega,2015-01-02

This is my very simple template.tex:

\documentclass{scrartcl}

\usepackage{fontspec}

\begin{document}

\begin{itemize}
  \item @@productID
  \item @@firstname
  \item @@lastname
\end{itemize}

\end{document}

This is the Python script; please ask if anything is unclear:

# codecs provides input/output with specific encoding
import codecs
# os includes operating system operations
import os
# this is needed to call latex from python
from subprocess import call
# pandas for data munching
import pandas

# create folders, no error if it already exists
os.makedirs('tmp', exist_ok=True)
os.makedirs('output', exist_ok=True)

# read in the template:
with codecs.open('template.tex', encoding='utf8') as f:
    template = f.read()

data = pandas.read_csv('data.csv')
# show the first 5 rows in the data to have a quick look
print(data.head())

# these are the keys we want to replace with our data:
keys = [
    'productID',
    'firstname',
    'lastname',
]

# now we loop over each row to create a pdf with the
# data
for index, row in data.iterrows():
    filled = template
    for key in keys:
        # replace our placeholder with the actual data, cast to string first
        filled = filled.replace('@@' + key, str(row[key]))

    # create a hopefully unique filename
    filename = 'filled_{}_{}_{}'.format(
        row.lastname,
        row.firstname,
        row.date,
    )
    # now we write the filled template to the tmp folder
    with codecs.open('tmp/' + filename + '.tex', 'w', encoding='utf8') as f:
        f.write(filled)

    # and call lualatex or any other latex compiler
    # call takes a list of arguments
    call(['lualatex',
          '--interaction=batchmode',
          '--output-directory=tmp',
          'tmp/' + filename + '.tex',
          ])

    # there is a missing newline at the end of the latex call
    print('\n')

    # now move the file to the output folder:
    os.rename('tmp/' + filename + '.pdf', 'output/' + filename + '.pdf')

# now we delete the tmp folder; shutil.rmtree also works on Windows,
# where the rm command is not available
import shutil
shutil.rmtree('tmp')
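
One thing the script above does not handle: if a data value contains characters that are special in LaTeX (such as `&`, `%` or `_`), the filled template may fail to compile. A minimal sketch of an escaping helper follows. The names `latex_escape` and `LATEX_SPECIALS` are hypothetical, and the table covers only the common ASCII special characters.

```python
# characters with special meaning in LaTeX and a safe replacement for each
LATEX_SPECIALS = {
    '\\': r'\textbackslash{}',
    '&': r'\&',
    '%': r'\%',
    '$': r'\$',
    '#': r'\#',
    '_': r'\_',
    '{': r'\{',
    '}': r'\}',
    '~': r'\textasciitilde{}',
    '^': r'\textasciicircum{}',
}


def latex_escape(value):
    """Return str(value) with LaTeX special characters escaped."""
    # handle one character at a time so replacements are never escaped again
    return ''.join(LATEX_SPECIALS.get(char, char) for char in str(value))


print(latex_escape('Vega & Winnfield_100%'))
```

In the loop, the replacement line would then read `filled.replace('@@' + key, latex_escape(row[key]))` instead of using `str(row[key])` directly.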

pandas also provides read_excel, read_sql_table, read_sql_query, and so on: http://pandas.pydata.org/pandas-docs/stable/io.html
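
As a rough sketch of the database route, the following reads the same two rows through read_sql_query instead of read_csv. It assumes an in-memory SQLite database and a hypothetical table name `orders`, chosen only so the example is self-contained. For MySQL, pandas generally expects an SQLAlchemy connection instead of the sqlite3 one used here.

```python
import sqlite3

import pandas

# build a throwaway in-memory database holding the same rows as data.csv
# (the table name "orders" is made up for this example)
connection = sqlite3.connect(':memory:')
connection.execute(
    'CREATE TABLE orders '
    '(productID INTEGER, firstname TEXT, lastname TEXT, date TEXT)'
)
connection.executemany(
    'INSERT INTO orders VALUES (?, ?, ?, ?)',
    [
        (1, 'Jules', 'Winnfield', '2015-01-01'),
        (2, 'Vincent', 'Vega', '2015-01-02'),
    ],
)
connection.commit()

# read_sql_query returns a DataFrame, just like read_csv
data = pandas.read_sql_query(
    'SELECT * FROM orders ORDER BY productID', connection
)
connection.close()

print(data.head())
```

The resulting `data` has the same columns as the CSV version, so the rest of the script should not need to change.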

Answer 1