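I have about 1000 front and back scans of different documents, stored in 2 separate folders. I would like to create a batch operation that merges each front scan with its corresponding back scan into a single document.
Edit: I am on Windows XP, and the scans are PDFs. The fronts are in one folder and the backs are in another. The files are named 1-1-NAME, 1-2-NAME, where NAME is a four-letter identifier.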
Answer 1
Are you looking for ImageMagick's montage? ImageMagick can handle PDFs.
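For a single pair, a montage call driven from Python might look roughly like the sketch below. It assumes the ImageMagick 6 `montage` executable is on the PATH and that Ghostscript is installed so ImageMagick can read PDFs. The helper names and the 150 dpi density are my own placeholders, not something from the question. Keep in mind that ImageMagick rasterizes PDF pages, so the output is an image-based PDF.

```python
import subprocess


def build_montage_command(front_pdf, back_pdf, out_pdf, density=150):
    """Build a montage command placing two first pages side by side.

    Hypothetical helper: the 150 dpi default is an arbitrary assumption.
    """
    return [
        'montage',
        '-density', str(density),   # rasterization resolution for the PDF input
        front_pdf + '[0]',          # [0] selects the first page only
        back_pdf + '[0]',
        '-tile', '2x1',             # two columns, one row
        '-geometry', '+0+0',        # keep tile size, no spacing between tiles
        out_pdf,
    ]


def montage_pair(front_pdf, back_pdf, out_pdf):
    """Run montage for one front/back pair; raises if the command fails."""
    subprocess.check_call(build_montage_command(front_pdf, back_pdf, out_pdf))
```

Calling `montage_pair('front.pdf', 'back.pdf', 'out.pdf')` in a loop over the file pairs would give the batch operation.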
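If you want more flexibility than ImageMagick montage offers, you could also use the pyPdf library for Python. pyPdf can merge PDF pages and apply basic transformations (e.g. translation, rotation, scaling). Example script: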
import errno
import os
import sys

import pyPdf  # pyPdf targets Python 2, hence the print statements below


def merge_horizontal(out_filename, left_filename, right_filename):
    """Merge the first page of two PDFs side by side."""
    # open the PDF files in binary mode (required on Windows)
    with open(left_filename, 'rb') as left_file, \
            open(right_filename, 'rb') as right_file, \
            open(out_filename, 'wb') as output_file:
        left_pdf = pyPdf.PdfFileReader(left_file)
        right_pdf = pyPdf.PdfFileReader(right_file)
        output = pyPdf.PdfFileWriter()

        # get the first page from each pdf
        left_page = left_pdf.pages[0]
        right_page = right_pdf.pages[0]

        # start a new blank page with a size that can fit the merged pages side by side
        page = output.addBlankPage(
            width=left_page.mediaBox.getWidth() + right_page.mediaBox.getWidth(),
            height=max(left_page.mediaBox.getHeight(), right_page.mediaBox.getHeight()),
        )

        # draw the pages on that new page, the right one shifted by the left one's width
        page.mergeTranslatedPage(left_page, 0, 0)
        page.mergeTranslatedPage(right_page, left_page.mediaBox.getWidth(), 0)

        # write to file
        output.write(output_file)


def mkdir_p(path):
    """Create the directory, ignoring the error if it already exists."""
    try:
        os.makedirs(path)
    except OSError as exc:
        if not (exc.errno == errno.EEXIST and os.path.isdir(path)):
            raise


if __name__ == '__main__':
    output_folder_name = sys.argv[1]
    left_folder_name = sys.argv[2]
    right_folder_name = sys.argv[3]

    left_files = set(os.listdir(left_folder_name))
    right_files = set(os.listdir(right_folder_name))

    mkdir_p(output_folder_name)

    # for every file name that is in both left_files and right_files
    for f in left_files.intersection(right_files):
        output_file_name = os.path.join(output_folder_name, f)
        left_file_name = os.path.join(left_folder_name, f)
        right_file_name = os.path.join(right_folder_name, f)

        print 'merging %s and %s into %s' % (left_file_name, right_file_name, output_file_name)
        merge_horizontal(output_file_name, left_file_name, right_file_name)

    # pair is missing, not merging
    print 'Only in left folder: ', left_files - right_files
    print 'Only in right folder: ', right_files - left_files
Then call the script like this:
python merge.py output_folder left_folder right_folder
Example output:
merging folderA/two.pdf and folderB/two.pdf into output/dacd/adca/two.pdf
merging folderA/one.pdf and folderB/one.pdf into output/dacd/adca/one.pdf
merging folderA/three.pdf and folderB/three.pdf into output/dacd/adca/three.pdf
Only in left folder: set(['four.pdf'])
Only in right folder: set(['five.pdf'])
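Note that the script pairs files that have the same name in both folders. If your fronts and backs are named differently (the question mentions 1-1-NAME and 1-2-NAME), you would need to pair them by the shared identifier instead. Here is a minimal sketch of that, assuming the four-letter NAME is whatever follows the last hyphen and that it is unique within each folder. I am guessing at the exact naming scheme, and the function names are made up for illustration.

```python
import os


def identifier(file_name):
    """Return the text after the last hyphen, without the extension.

    Assumption: names look like '1-1-ABCD.pdf', so this yields 'ABCD'.
    """
    stem = os.path.splitext(file_name)[0]
    return stem.rsplit('-', 1)[-1].upper()


def pair_by_identifier(left_names, right_names):
    """Pair file names from two folders that share the same identifier.

    Returns (pairs, only_left, only_right), each sorted for stable output.
    """
    left = dict((identifier(n), n) for n in left_names)
    right = dict((identifier(n), n) for n in right_names)

    common = sorted(set(left) & set(right))
    pairs = [(left[key], right[key]) for key in common]
    only_left = sorted(left[key] for key in set(left) - set(right))
    only_right = sorted(right[key] for key in set(right) - set(left))
    return pairs, only_left, only_right
```

You would feed it the two `os.listdir(...)` results and then call `merge_horizontal` once per pair, for example writing each result to `identifier(front) + '.pdf'` in the output folder.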