Making a low-quality PDF file clearer

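Some research papers on the web are of poor quality. The example I have in mind seems to have been converted from a PS file that was typeset with (La)TeX. The font information is all there, though. Is there any way to make the PDF file clearer?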

Answer 1

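1. Download the PostScript output (dvips output)

Try to get the PostScript version; at least the cached version seems to work.

The file was generated by dvips with bitmap fonts. Hence the "poor" quality.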

2. Analyze the PostScript file

At the top of the file we find:

%%Creator: dvips 5.47 Copyright 1986-91 Radical Eye Software

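That is too old for pkfix, which needs the font comments written by dvips 5.58 or later. If the file contains font comments such as

%DVIPSBitmapFont: Fa cmr7 7 6

then you can skip step 3 and go directly to step 4.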
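
The check in this step can be scripted. The following is a minimal sketch, assuming a POSIX shell with grep and sed; the function names has_font_comments and dvips_version are made up for this example and are not part of pkfix.

```shell
#!/bin/sh
# Succeeds if the PostScript file already carries the
# %DVIPSBitmapFont comments that pkfix relies on.
# (has_font_comments is a hypothetical helper name.)
has_font_comments() {
    grep -q '^%DVIPSBitmapFont:' "$1"
}

# Prints the dvips version found in the %%Creator line,
# or nothing if there is no such line.
# (dvips_version is a hypothetical helper name.)
dvips_version() {
    sed -n 's/^%%Creator: dvips[^0-9]*\([0-9][0-9.]*\).*/\1/p' "$1" | head -n 1
}

# Example use:
#   if has_font_comments crt.ps; then
#       echo "font comments present: go to step 4"
#   else
#       echo "dvips $(dvips_version crt.ps): run pkfix-helper first"
#   fi
```

Testing for the comments themselves is more reliable than comparing version numbers, because the comments are what pkfix actually reads.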

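3. Prepare the PostScript file for pkfix using pkfix-helper

So this is a job for pkfix-helper. It tries to identify the fonts with heuristics (character boxes) and writes the missing font comments that pkfix needs into the PostScript file: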

$ pkfix-helper crt.ps crt-helper.ps

It reports:

Reading crt.ps ... done.
Number of Type 3 fonts encountered in included documents: 0
Total number of Type 3 fonts encountered: 21
pkfix-helper: Could not determine the target printer resolution; assuming 300 DPI
Finding character widths ... done.
Reading TFM files ... done (103 TFMs in 193 scaling variations).
Matching fonts:
    Processing Fp ... done (cmr10 @ 1X, mismatch=0.49957).
    Processing Fq ... done (cmr9 @ 1X, mismatch=0.27830).
    Processing Fi ... done (cmbx10 @ 1X, mismatch=0.34628).
    Processing Fo ... done (cmti10 @ 1X, mismatch=0.17562).
    Processing Fe ... done (cmr7 @ 1X, mismatch=0.40178).
    Processing Fr ... done (cmssbx10 @ 1X, mismatch=0.24687).
    Processing Fl ... done (cmmi10 @ 1X, mismatch=0.23339).
    Processing Fn ... done (cmr10 @ 1X, mismatch=0.24939).
    Processing Fm ... done (cmsy10 @ 1X, mismatch=0.13897).
    Processing Fg ... done (cmti9 @ 1X, mismatch=0.15207).
    Processing Fj ... done (cmmi7 @ 1X, mismatch=0.08061).
    Processing Fs ... done (cmss10 @ 1.2X, mismatch=0.25618).
    Processing Fb ... done (cmbx7 @ 1X, mismatch=0.06942).
    Processing Ft ... done (cmbx12 @ 1.2X, mismatch=5.94738).
pkfix-helper: Best match for Ft is rather poor
    Processing Fa ... done (cmr7 @ 1X, mismatch=0.07234).
    Processing Fh ... done (cmsy7 @ 1X, mismatch=0.01447).
    Processing Fc ... done (cmmi5 @ 1X, mismatch=0.00759).
    Processing Fd ... done (lasy10 @ 1X, mismatch=0.00005).
    Processing Ff ... done (cmex8 @ 1X, mismatch=0.00482).
    Processing Fk ... done (cmex10 @ 1X, mismatch=0.00181).
    Processing Fu ... done (cmbxti12 @ 2.0733X, mismatch=43.70349).
pkfix-helper: Best match for Fu is rather poor

Take the warnings seriously: the fonts Ft and Fu should be excluded. Keeping a bitmap font is better than using the wrong font.

$ pkfix-helper -k Fu -k Ft crt.ps crt-helper.ps
Reading crt.ps ... done.
Number of Type 3 fonts encountered in included documents: 0
Total number of Type 3 fonts encountered: 21
pkfix-helper: Could not determine the target printer resolution; assuming 300 DPI
Finding character widths ... done.
Reading TFM files ... done (103 TFMs in 193 scaling variations).
Matching fonts:
    Processing Fp ... done (cmr10 @ 1X, mismatch=0.49957).
    Processing Fq ... done (cmr9 @ 1X, mismatch=0.27830).
    Processing Fi ... done (cmbx10 @ 1X, mismatch=0.34628).
    Processing Fo ... done (cmti10 @ 1X, mismatch=0.17562).
    Processing Fe ... done (cmr7 @ 1X, mismatch=0.40178).
    Processing Fr ... done (cmssbx10 @ 1X, mismatch=0.24687).
    Processing Fl ... done (cmmi10 @ 1X, mismatch=0.23339).
    Processing Fn ... done (cmr10 @ 1X, mismatch=0.24939).
    Processing Fm ... done (cmsy10 @ 1X, mismatch=0.13897).
    Processing Fg ... done (cmti9 @ 1X, mismatch=0.15207).
    Processing Fj ... done (cmmi7 @ 1X, mismatch=0.08061).
    Processing Fs ... done (cmss10 @ 1.2X, mismatch=0.25618).
    Processing Fb ... done (cmbx7 @ 1X, mismatch=0.06942).
    Retaining Ft as a bitmapped font.
    Processing Fa ... done (cmr7 @ 1X, mismatch=0.07234).
    Processing Fh ... done (cmsy7 @ 1X, mismatch=0.01447).
    Processing Fc ... done (cmmi5 @ 1X, mismatch=0.00759).
    Processing Fd ... done (lasy10 @ 1X, mismatch=0.00005).
    Processing Ff ... done (cmex8 @ 1X, mismatch=0.00482).
    Processing Fk ... done (cmex10 @ 1X, mismatch=0.00181).
    Retaining Fu as a bitmapped font.
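
The list of fonts to keep was read off the warnings by hand. As a rough sketch, the same list can be derived from a saved log. This assumes the pkfix-helper messages were captured in a file (both stdout and stderr, since the warnings may go to either) and that the warning wording matches the log above; poor_fonts and helper.log are hypothetical names.

```shell
#!/bin/sh
# Reads a saved pkfix-helper log and prints one "-k NAME" option
# for every font that pkfix-helper flagged as a poor match.
# (poor_fonts is a hypothetical helper name.)
poor_fonts() {
    awk '/^pkfix-helper: Best match for .* is rather poor/ {
             # Field 5 is the font name, e.g. Ft
             printf "%s-k %s", sep, $5
             sep = " "
         }
         END { if (sep != "") print "" }' "$1"
}

# Example use (the unquoted substitution is intentional, so that
# the options are split into separate words):
#   pkfix-helper crt.ps crt-helper.ps > helper.log 2>&1
#   pkfix-helper $(poor_fonts helper.log) crt.ps crt-helper.ps
```

This only automates reading the warnings; which mismatch counts as "poor" is still decided by pkfix-helper itself, and borderline matches deserve a manual look.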

4. Apply pkfix

Now the file crt-helper.ps is processed by pkfix:

$ pkfix crt-helper.ps crt-fixed.ps
PKFIX 1.7, 2012/04/18 - Copyright (c) 2001, 2005, 2007, 2009, 2011, 2012 by Heiko Oberdiek.
*** Font conversion: `cmr7' -> `CMR7'.
*** Font conversion: `cmbx7' -> `CMBX7'.
*** Font conversion: `cmmi5' -> `CMMI5'.
*** Font conversion: `lasy10' -> `LASY10'.
*** Font conversion: `cmr7' -> `CMR7'.
*** Font conversion: `cmex8' -> `CMEX8'.
*** Font conversion: `cmti9' -> `CMTI9'.
*** Font conversion: `cmsy7' -> `CMSY7'.
*** Font conversion: `cmbx10' -> `CMBX10'.
*** Font conversion: `cmmi7' -> `CMMI7'.
*** Font conversion: `cmex10' -> `CMEX10'.
*** Font conversion: `cmmi10' -> `CMMI10'.
*** Font conversion: `cmsy10' -> `CMSY10'.
*** Font conversion: `cmr10' -> `CMR10'.
*** Font conversion: `cmti10' -> `CMTI10'.
*** Font conversion: `cmr10' -> `CMR10'.
*** Font conversion: `cmr9' -> `CMR9'.
*** Font conversion: `cmssbx10' -> `CMSSBX10'.
*** Font conversion: `cmss10' -> `CMSS10'.
*** Merging font `CMR7' (2).
*** Merging font `CMR10' (2).
==> 19 converted fonts.
==> 2 merged fonts.

5. Convert to PDF

Then convert the resulting PS file to PDF:

$ ps2pdf crt-fixed.ps
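
Steps 3 to 5 can be chained for repeated use. The sketch below only strings together the three commands shown in this answer; fix_ps, run, and the DRY_RUN variable are names invented for this example, and the output PDF name is given explicitly to ps2pdf rather than left to its default.

```shell
#!/bin/sh
# With DRY_RUN=1 the command is only printed; otherwise it is executed.
# (run and DRY_RUN are hypothetical names for this sketch.)
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# fix_ps FILE.ps [pkfix-helper options...]
# Runs pkfix-helper, pkfix and ps2pdf in sequence and stops at the
# first failing step. Extra arguments (e.g. -k Fu -k Ft) are passed
# on to pkfix-helper.
fix_ps() {
    base=${1%.ps}    # crt.ps -> crt
    shift
    run pkfix-helper "$@" "$base.ps" "$base-helper.ps" &&
    run pkfix "$base-helper.ps" "$base-fixed.ps" &&
    run ps2pdf "$base-fixed.ps" "$base-fixed.pdf"
}

# Example use:
#   fix_ps crt.ps -k Fu -k Ft
```

Setting DRY_RUN=1 first prints the three commands without running them, which is a cheap way to confirm the file names before touching anything.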

6. Check the fonts of the PDF file

The PDF now mainly contains Type 1 (vector) fonts:

$ pdffonts crt-fixed.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
EXRKAF+CMR10                         Type 1C           Custom           yes yes no      31  0
XYBRJB+CMR9                          Type 1C           Custom           yes yes no      29  0
AIEJHE+CMSSBX10                      Type 1C           WinAnsi          yes yes no      27  0
VRVPVC+CMSS10                        Type 1C           WinAnsi          yes yes no      25  0
[none]                               Type 3            Custom           yes no  no      17  0
[none]                               Type 3            Custom           yes no  no       9  0
ECQATD+CMTI10                        Type 1C           Custom           yes yes no      38  0
GUCXJN+CMMI7                         Type 1C           Custom           yes yes no      46  0
QSBVLU+CMEX10                        Type 1C           Custom           yes yes no      44  0
KLJBVV+CMMI10                        Type 1C           Custom           yes yes no      42  0
WMLJNK+CMSY10                        Type 1C           Custom           yes yes yes     40  0
NXXIQZ+CMTI9                         Type 1C           WinAnsi          yes yes no      57  0
GWEEWQ+CMSY7                         Type 1C           Custom           yes yes yes     55  0
VSMLLB+CMBX10                        Type 1C           Custom           yes yes no      53  0
IQGMWX+LASY10                        Type 1C           Custom           yes yes yes     73  0
TOJKBS+CMR7                          Type 1C           Custom           yes yes no      71  0
OTLROK+CMEX8                         Type 1C           Custom           yes yes no      69  0
Times-Roman                          Type 1            Standard         no  no  no     120  0
MQZZBQ+CMBX7                         Type 1C           WinAnsi          yes yes no     133  0
GROEMQ+CMMI5                         Type 1C           WinAnsi          yes yes no     131  0
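
To see at a glance how many bitmapped fonts are left, the pdffonts listing can be filtered. This is a small sketch that assumes the column layout shown above (two header lines, then one font per line); type3_count is a hypothetical helper name.

```shell
#!/bin/sh
# Reads pdffonts output on stdin and prints the number of fonts
# that are still Type 3 (bitmapped).
# (type3_count is a hypothetical helper name.)
type3_count() {
    # Skip the two header lines, then count rows whose type column is "Type 3".
    awk 'NR > 2 && / Type 3 / { n++ } END { print n + 0 }'
}

# Example use:
#   pdffonts crt-fixed.pdf | type3_count
```

For the listing above this counts the two `[none]` entries, which presumably are the fonts Ft and Fu that were deliberately kept as bitmaps.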

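Caveat: do not forget to check the visual appearance as well. The heuristics of pkfix-helper may have picked the wrong font, pkfix may have failed to recognize something, one of the programs may have a bug, or ...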

Comparison

As requested, here are two sets of screenshots from AR9/Linux at 200% and 650% magnification. The first set is from the original PDF file (cached version):

[Screenshot of crt.pdf]

[Screenshot of an equation in crt.pdf]

The second set shows the same areas in the fixed PDF file (produced as described above):

[Screenshot of crt-fixed.pdf]

[Screenshot of an equation in crt-fixed.pdf]
