How do I merge PDF files so that each file starts on an odd page number?

I need to merge a few dozen PDFs, and I want every input PDF to start on an odd page in the output PDF.

Example: A.pdf has 3 pages and B.pdf has 4 pages. I don't want a 7-page output. I want an 8-page PDF in which pages 1-3 come from A.pdf, page 4 is blank, and pages 5-8 come from B.pdf. How can I do this?

I know about pdftk, but I did not find such an option in its man page.

Answer 1

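If you are willing to write a little Python, the pypdf library makes this kind of thing easy. Save the code below in a script called pdf-cat-even (or whatever you like), make it executable (chmod +x pdf-cat-even), and run it with the output redirected to a file: ./pdf-cat-even a.pdf b.pdf >concatenated.pdf (the current version of pypdf does not support writing to a pipe).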

#!/usr/bin/env python3
import sys
from pypdf import PdfWriter, PdfReader
output = PdfWriter()
output_page_number = 0
alignment = 2           # pad each file to an even page count, so the next file starts on an odd page
for filename in sys.argv[1:]:
    # This code is executed for every file in turn
    reader = PdfReader(filename)
    for p in reader.pages:
        # This code is executed for every input page in turn
        output.add_page(p)
        output_page_number += 1
    # Add blank pages until the running total is a multiple of alignment
    while output_page_number % alignment != 0:
        output.add_blank_page()
        output_page_number += 1
output.write(sys.stdout.buffer)
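
The padding rule used by the script above can be checked on its own, without any PDF library. The following is only a sketch for illustration: `layout` is a hypothetical helper (not part of pypdf) that takes the page counts of the input files and reports where each file would start and how long the merged output would be.

```python
def layout(page_counts, alignment=2):
    """Return (start_pages, total_pages) for files merged with padding.

    Mirrors the script's rule: after each file, blank pages are added until
    the running page count is a multiple of `alignment`.
    """
    start_pages = []
    total = 0
    for count in page_counts:
        start_pages.append(total + 1)   # 1-based page where this file begins
        total += count
        total += -total % alignment     # blank pages needed to reach the next multiple
    return start_pages, total


# The example from the question: A.pdf has 3 pages, B.pdf has 4 pages.
print(layout([3, 4]))   # ([1, 5], 8)
```

With the default alignment of 2, every start page is odd, and one blank page is added after each file with an odd page count.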

Answer 2

The first step is to produce a PDF file containing a single blank page. Many programs can do this easily (LibreOffice/OpenOffice, Inkscape, (La)TeX, Scribus, etc.).

Then simply include this blank page wherever it is needed:

pdftk A.pdf empty_page.pdf B.pdf output result.pdf 

If you want to automate this with a script, you can use e.g. pdftk file.pdf dump_data | grep NumberOfPages | egrep -o '[0-9]*' to extract the page count.
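
The automation hinted at above could look roughly like the following Python sketch. It is not a tested tool: the function names and the command-line layout (output file first, then the inputs) are made up for illustration, empty_page.pdf is assumed to exist already, and pdftk is assumed to accept the same blank file several times in its input list. The page count is read from the NumberOfPages line of pdftk's dump_data output, as in the shell pipeline above.

```python
import re
import subprocess
import sys


def parse_page_count(dump_data_text):
    """Extract the page count from the text printed by `pdftk file.pdf dump_data`."""
    match = re.search(r"^NumberOfPages:\s*(\d+)", dump_data_text, re.MULTILINE)
    if match is None:
        raise ValueError("no NumberOfPages line found")
    return int(match.group(1))


def page_count(filename):
    """Ask pdftk for the number of pages in one file."""
    result = subprocess.run(
        ["pdftk", filename, "dump_data"],
        check=True, capture_output=True, text=True,
    )
    return parse_page_count(result.stdout)


def build_input_list(files_with_counts, blank="empty_page.pdf"):
    """Return the pdftk input list, adding the blank page after odd-length files.

    The last file is not padded, since nothing follows it.
    """
    inputs = []
    last_index = len(files_with_counts) - 1
    for index, (filename, pages) in enumerate(files_with_counts):
        inputs.append(filename)
        if pages % 2 == 1 and index != last_index:
            inputs.append(blank)
    return inputs


def main(argv):
    # Hypothetical usage: ./merge-odd result.pdf A.pdf B.pdf ...
    out, files = argv[1], argv[2:]
    counts = [(name, page_count(name)) for name in files]
    subprocess.run(["pdftk", *build_input_list(counts), "output", out], check=True)


if __name__ == "__main__" and len(sys.argv) > 2:
    main(sys.argv)
```

Because every file starts on an odd page once the files before it are padded, a file needs a blank page after it exactly when its own page count is odd.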

Answer 3

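Gilles' answer worked for me, but since I had to merge many files, it was more convenient to read their names from a text file. I modified Gilles' code slightly to do that; maybe it will help others: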

#!/usr/bin/env python

# requires PyPdf library, version 1.13 or above -
# its homepage is http://pybrary.net/pyPdf/
# running: ./this-script-name file-with-pdf-list > output.pdf

import sys
from pyPdf import PdfFileWriter, PdfFileReader
output = PdfFileWriter()
output_page_number = 0

# every new file should start on (n*alignment + 1)th page
# (with value 2 this means starting always on an odd page)
alignment = 2

listoffiles = open(sys.argv[1]).read().splitlines()
for filename in listoffiles:
    # This code is executed for every file in turn
    input = PdfFileReader(open(filename, "rb"))  # PDFs must be opened in binary mode
    for i in range(input.getNumPages()):
        # This code is executed for every input page in turn
        output.addPage(input.getPage(i))
        output_page_number += 1
    while output_page_number % alignment != 0:
        output.addBlankPage()
        output_page_number += 1
output.write(sys.stdout)
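
One practical detail when reading names from a list file: with read().splitlines(), a blank line in the list becomes an empty file name, and opening it fails. A small helper along the following lines could make the list more forgiving. This is a sketch in Python 3, not part of the script above; `read_file_list` is a hypothetical name, and skipping lines that start with # is an assumed convention.

```python
def read_file_list(path):
    """Return the PDF names listed in a text file, one per line.

    Blank lines and lines starting with '#' are skipped (an assumed
    convention, not something the original script does).
    """
    names = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            name = line.strip()
            if name and not name.startswith("#"):
                names.append(name)
    return names
```

The returned list can then be iterated in place of listoffiles.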

Answer 4

Here is the code for PyPDF2 and Python 3:

#!/usr/bin/env python3

# requires PyPDF2 library, version 1.26 or above -
# its homepage is https://pythonhosted.org/PyPDF2/index.html
# running: ./this-script-name output.pdf a.pdf b.pdf ...

import sys
from PyPDF2 import PdfFileWriter, PdfFileReader
output = PdfFileWriter()
output_page_number = 0

# every new file should start on (n*alignment + 1)th page
# (with value 2 this means starting always on an odd page)
alignment = 2

for filename in sys.argv[2:]:
    # This code is executed for every file in turn
    input = PdfFileReader(open(filename, "rb"))
    output.appendPagesFromReader(input)
    output_page_number += input.getNumPages()

    while output_page_number % alignment != 0:
        output.addBlankPage()
        output_page_number += 1

# Write the merged document, closing the output file when done
with open(sys.argv[1], "wb") as out_file:
    output.write(out_file)
