Accented characters missing when converting PDF to image with ImageMagick or Ghostscript

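I have a problem converting a PDF to an image with ImageMagick or Ghostscript: every accented character is missing from the resulting image. I found several people with the same problem, and upgrading the imagemagick and ghostscript packages apparently fixed it for them, but it did not work for me.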

I use this PDF file in every test: https://www.dropbox.com/s/3gso0sw1e1n8f9r/error-with-accents.pdf?dl=0

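I have an Ubuntu 14.04.2 LTS server on Azure, and I need ImageMagick working on it. The official repositories gave me ImageMagick 6.7.7 and Ghostscript 9.10. To try to fix the problem I upgraded both: I now also have ImageMagick 6.8.9-10 running from /opt/imagemagick-6.8, and I added the Ubuntu 15.04 repository so that I could install Ghostscript 9.15 directly with apt-get. Neither change fixed the problem.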

Here are my latest attempts on the Ubuntu 14.04 server:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 14.04.2 LTS
Release:    14.04
Codename:   trusty

$ /opt/imagemagick-6.8/bin/convert -version
Version: ImageMagick 6.8.9-10 Q16 x86_64 2015-07-30 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2014 ImageMagick Studio LLC
Features: DPC OpenMP
Delegates: jng jpeg png x xml zlib

$ /opt/imagemagick-6.8/bin/convert -list configure |grep DELEGATES
DELEGATES      mpeg jng jpeg png ps x xml zlib

$ /opt/imagemagick-6.8/bin/convert error-with-accents.pdf -verbose -alpha off -resample 150 -density 150 -quality '80' im-test.jpg
   **** Warning: considering '0000000000 XXXXX n' as a free entry.

   **** This file had errors that were repaired or ignored.
   **** The file was produced by: 
   **** >>>> Mac OS X 10.10.4 Quartz PDFContext <<<<
   **** Please notify the author of the software that produced this
   **** file that it does not conform to Adobe's published PDF
   **** specification.

error-with-accents.pdf=>im-test.jpg PDF 595x794=>1240x1654 1240x1654+0+0 16-bit sRGB 172KB 0.440u 0:00.240

$ gs -v
GPL Ghostscript 9.15 (2014-09-22)
Copyright (C) 2014 Artifex Software, Inc.  All rights reserved.

$ gs -dBATCH -dNOPAUSE -sDEVICE=jpeg -sOutputFile=gs-test.jpg error-with-accents.pdf 
GPL Ghostscript 9.15 (2014-09-22)
Copyright (C) 2014 Artifex Software, Inc.  All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
   **** Warning: considering '0000000000 XXXXX n' as a free entry.
Processing pages 1 through 1.
Page 1

   **** This file had errors that were repaired or ignored.
   **** The file was produced by: 
   **** >>>> Mac OS X 10.10.4 Quartz PDFContext <<<<
   **** Please notify the author of the software that produced this
   **** file that it does not conform to Adobe's published PDF
   **** specification.

$ convert -version
Version: ImageMagick 6.7.7-10 2014-03-06 Q16 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2012 ImageMagick Studio LLC
Features: OpenMP    

$ convert -list configure |grep DELEGATES
DELEGATES     bzlib djvu fftw fontconfig freetype jbig jpeg jng jp2 lcms2 lqr lzma openexr pango png rsvg tiff x11 xml wmf zlib

$ convert error-with-accents.pdf -verbose -alpha off -resample 150 -density 150 -quality '80' im-test-6.7.7.jpg
   **** Warning: considering '0000000000 XXXXX n' as a free entry.

   **** This file had errors that were repaired or ignored.
   **** The file was produced by: 
   **** >>>> Mac OS X 10.10.4 Quartz PDFContext <<<<
   **** Please notify the author of the software that produced this
   **** file that it does not conform to Adobe's published PDF
   **** specification.

error-with-accents.pdf=>im-test-6.7.7.jpg PDF 595x794=>1240x1654 1240x1654+0+0 16-bit DirectClass 160KB 0.490u 0:00.279

All three produce the same result:
https://www.dropbox.com/s/eob6y234x37s864/gs-test.jpg?dl=0
https://www.dropbox.com/s/96z1pkksdn1dpr4/im-test.jpg?dl=0
https://www.dropbox.com/s/dev0kbza2c8v2gf/im-test-6.7.7.jpg?dl=0

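On Mac OS, both Ghostscript and ImageMagick convert the file correctly. And according to this post, the versions I installed on Ubuntu should work. So I think the problem has something to do with FreeType fonts, but I do not know how to fix it. Any help?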

Answer 1

Thanks to Kurt Pfeifle on Stack Overflow for finding the answer.

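The problem was the Ghostscript version installed on the server. Since 9.15 is the newest Ghostscript version in the Ubuntu Wily repositories, I downloaded the official Linux x64 binary package from the Ghostscript website.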

Then I replaced the /usr/bin/gs binary with the binary from that package, and everything works. No more accent problems.
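
For anyone repeating this, here is a minimal sketch of the swap that also keeps a copy of the distribution binary so it can be undone. The function name replace_gs and the path of the downloaded binary are placeholders of mine, not something from the Ghostscript package, and a later package upgrade may overwrite /usr/bin/gs again.

```shell
# Sketch only: swap a newer Ghostscript binary into place, keeping a backup.
# Both arguments are assumptions; adjust them to where the package was unpacked.
replace_gs() {
    new_gs=$1                   # hypothetical path to the downloaded binary
    target=${2:-/usr/bin/gs}    # binary to replace, as described in the answer

    if [ ! -x "$new_gs" ]; then
        echo "not an executable file: $new_gs" >&2
        return 1
    fi

    # Keep the distribution's binary (once) so the change can be undone later.
    if [ -e "$target" ] && [ ! -e "$target.dist" ]; then
        cp -p "$target" "$target.dist" || return 1
    fi

    install -m 755 "$new_gs" "$target" || return 1

    # Print the version now in place to confirm the swap.
    "$target" --version
}
```

Run it from a root shell with the path to the unpacked binary as the first argument. To go back, copy /usr/bin/gs.dist over /usr/bin/gs.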

Answer 2

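I ran into the same problem when trying to print PDFs that contain accented characters. I concluded that Ghostscript was at fault, because CUPS uses it through the gstoraster filter to rasterize PDFs. I also found that a recent Ghostscript binary, run standalone, works fine.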

I do not recommend replacing /usr/bin/gs, because that may break some dependencies (e.g. CUPS)!
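
One way to use a newer binary without touching /usr/bin/gs is to keep it in its own directory and put that directory first on PATH only for the command that needs it. This is a rough sketch: run_with_gs is a name I made up, the directory is whatever you unpacked the binary into (with the binary named gs), and it only helps if ImageMagick's delegate configuration calls gs by bare name rather than by absolute path, which I am assuming here.

```shell
# Sketch only: run one command with a private Ghostscript directory first on PATH.
# The directory is an assumption; /usr/bin/gs itself is never touched.
run_with_gs() {
    gs_dir=$1
    shift

    if [ ! -x "$gs_dir/gs" ]; then
        echo "no executable gs in: $gs_dir" >&2
        return 1
    fi

    # The subshell keeps the PATH change from leaking into the calling shell.
    (
        PATH="$gs_dir:$PATH"
        export PATH
        "$@"
    )
}
```

For example, with the binary in a hypothetical /opt/ghostscript/bin: run_with_gs /opt/ghostscript/bin convert -density 150 error-with-accents.pdf -quality 80 im-test.jpg. CUPS and everything else on the system keep using the packaged gs.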

Instead, I suggest you take a look at pdfimages from poppler-utils.
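
A note on tool choice: as far as I know, pdfimages extracts images that are already embedded in a PDF, so it will not render text pages. The sketch below therefore uses pdftoppm, which ships in the same poppler-utils package and rasterizes pages without going through Ghostscript. That substitution is mine, and pdf_to_jpeg is just an illustrative wrapper name.

```shell
# Sketch only: rasterize a PDF with poppler's pdftoppm instead of Ghostscript.
# Assumes poppler-utils is installed; pdf_to_jpeg is a hypothetical helper name.
pdf_to_jpeg() {
    pdf=$1          # input PDF
    prefix=$2       # output prefix; pdftoppm appends the page number and .jpg
    dpi=${3:-150}   # resolution in DPI, matching the 150 used in the question

    pdftoppm -jpeg -r "$dpi" "$pdf" "$prefix"
}
```

Calling pdf_to_jpeg error-with-accents.pdf poppler-test writes one JPEG per page, named after the prefix. Whether the accents come out right with poppler is something to check on the actual file.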
