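I need to merge a few dozen PDFs, and I want every input PDF to start on an odd page in the output PDF.
Example: A.pdf has 3 pages and B.pdf has 4 pages. I don't want my output to have 7 pages. What I want is an 8-page PDF in which pages 1-3 come from A.pdf, page 4 is blank, and pages 5-8 come from B.pdf. How can I do this?
I know about pdftk, but I did not find such an option in the man page.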
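Answer 1
The pypdf library makes this sort of thing easy if you are willing to write a bit of Python. Save the code below in a script called pdf-cat-even (or whatever you like), make it executable (chmod +x pdf-cat-even), and run it with the output redirected to a file (./pdf-cat-even a.pdf b.pdf >concatenated.pdf; the current version of pypdf does not support writing to a pipe).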
    #!/usr/bin/env python3
    import sys
    from pypdf import PdfWriter, PdfReader

    output = PdfWriter()
    output_page_number = 0
    # Pad each file to an even page count, so the next file starts on an odd page
    alignment = 2

    for filename in sys.argv[1:]:
        # This code is executed for every file in turn
        reader = PdfReader(filename)
        for p in reader.pages:
            # This code is executed for every input page in turn
            output.add_page(p)
            output_page_number += 1
        # After each file, add blank pages until the count is a multiple of alignment
        while output_page_number % alignment != 0:
            output.add_blank_page()
            output_page_number += 1

    output.write(sys.stdout.buffer)
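The `while` loop in the script above adds blank pages one at a time until the running page count is a multiple of `alignment`. The same number can be computed directly, which makes it easy to check where each input will land before merging. A minimal sketch; `blank_pages_needed` and `start_pages` are hypothetical helper names, not part of pypdf:

```python
def blank_pages_needed(page_count, alignment=2):
    """Number of blank pages that pad page_count up to a multiple of alignment."""
    return -page_count % alignment


def start_pages(page_counts, alignment=2):
    """Return (1-based start page of each input, total pages including padding)."""
    starts = []
    total = 0
    for count in page_counts:
        starts.append(total + 1)
        total += count
        total += blank_pages_needed(total, alignment)
    return starts, total


if __name__ == "__main__":
    # The example from the question: A.pdf has 3 pages, B.pdf has 4 pages.
    print(start_pages([3, 4]))  # prints ([1, 5], 8)
```

Like the script, this pads after the last input as well, so the total is always a multiple of `alignment`.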
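Answer 2
The first step is to produce a PDF file containing a single blank page. You can do this easily with many programs (LibreOffice/OpenOffice, Inkscape, (La)TeX, Scribus, etc.).
Then include this blank page wherever it is needed:
    pdftk A.pdf empty_page.pdf B.pdf output result.pdf
If you want to automate this with a script, you can use, e.g.,
    pdftk file.pdf dump_data | grep NumberOfPages | egrep -o '[0-9]*'
to extract the number of pages.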
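The page-count extraction can drive the whole merge. What follows is only a sketch, assuming `pdftk` is on the PATH, that a one-page `empty_page.pdf` already exists in the current directory, and that pdftk accepts the same blank file listed more than once. The helper names `parse_page_count` and `build_pdftk_args` are made up for illustration:

```python
import re
import subprocess
import sys


def parse_page_count(dump_data_text):
    """Extract the page count from the text printed by `pdftk file.pdf dump_data`."""
    match = re.search(r"^NumberOfPages:\s*(\d+)", dump_data_text, re.MULTILINE)
    if match is None:
        raise ValueError("no NumberOfPages line found")
    return int(match.group(1))


def build_pdftk_args(page_counts, blank_pdf, output_pdf):
    """Build a pdftk command line from a list of (filename, pages) pairs.

    The blank PDF is inserted whenever the running page total is odd,
    so the next document starts on an odd page.
    """
    args = ["pdftk"]
    total = 0
    for filename, pages in page_counts:
        args.append(filename)
        total += pages
        if total % 2 != 0:
            args.append(blank_pdf)
            total += 1
    return args + ["output", output_pdf]


if __name__ == "__main__" and len(sys.argv) > 2:
    # Hypothetical usage: ./pdftk-cat-odd result.pdf A.pdf B.pdf
    output_pdf, inputs = sys.argv[1], sys.argv[2:]
    counts = []
    for name in inputs:
        dump = subprocess.run(["pdftk", name, "dump_data"],
                              capture_output=True, text=True, check=True).stdout
        counts.append((name, parse_page_count(dump)))
    subprocess.run(build_pdftk_args(counts, "empty_page.pdf", output_pdf), check=True)
```

For the example in the question (3 and 4 pages), `build_pdftk_args` produces the same command as the one typed by hand above.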
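Answer 3
Gilles' answer worked for me, but since I have to merge many files, it is more convenient to read their names from a text file. I modified Gilles' code slightly to do that; maybe it will help someone else: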
    #!/usr/bin/env python
    # requires PyPdf library, version 1.13 or above -
    # its homepage is http://pybrary.net/pyPdf/
    # (pyPdf is a legacy library and runs under Python 2)
    # running: ./this-script-name file-with-pdf-list > output.pdf
    import sys
    from pyPdf import PdfFileWriter, PdfFileReader

    output = PdfFileWriter()
    output_page_number = 0
    # every new file should start on (n*alignment + 1)th page
    # (with value 2 this means starting always on an odd page)
    alignment = 2

    # The list file holds one PDF file name per line
    listoffiles = open(sys.argv[1]).read().splitlines()
    for filename in listoffiles:
        # This code is executed for every file in turn
        # (PDFs are binary, so open them in "rb" mode)
        input = PdfFileReader(open(filename, "rb"))
        for i in range(input.getNumPages()):
            # This code is executed for every input page in turn
            output.addPage(input.getPage(i))
            output_page_number += 1
        # After each file, pad with blank pages up to the next multiple of alignment
        while output_page_number % alignment != 0:
            output.addBlankPage()
            output_page_number += 1

    output.write(sys.stdout)
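A list file written by hand often contains blank lines or stray whitespace, which `splitlines()` alone passes on as file names. A small sketch of a more forgiving reader follows; the function name `parse_pdf_list` and the `#` comment convention are assumptions made for illustration, not something the script above supports:

```python
def parse_pdf_list(text):
    """Return the file names in text, one per line, skipping blanks and # comments."""
    names = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        names.append(line)
    return names


if __name__ == "__main__":
    sample = "a.pdf\n\n# appendix\n  b.pdf  \n"
    print(parse_pdf_list(sample))  # prints ['a.pdf', 'b.pdf']
```

In the script, the result of such a function would take the place of `listoffiles`.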
Answer 4
Here is the code for PyPDF2 and Python 3:
    #!/usr/bin/env python3
    # requires PyPdf2 library, version 1.26 or above -
    # its homepage is https://pythonhosted.org/PyPDF2/index.html
    # running: ./this-script-name output.pdf input1.pdf input2.pdf ...
    import sys
    from PyPDF2 import PdfFileWriter, PdfFileReader

    output = PdfFileWriter()
    output_page_number = 0
    # every new file should start on (n*alignment + 1)th page
    # (with value 2 this means starting always on an odd page)
    alignment = 2

    for filename in sys.argv[2:]:
        # This code is executed for every file in turn
        input = PdfFileReader(open(filename, "rb"))
        # Copy all pages of this file in one call
        output.appendPagesFromReader(input)
        output_page_number += input.getNumPages()
        # After each file, pad with blank pages up to the next multiple of alignment
        while output_page_number % alignment != 0:
            output.addBlankPage()
            output_page_number += 1

    # The first argument is the output file name
    with open(sys.argv[1], "wb") as out_file:
        output.write(out_file)