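Some research papers on the web are of poor quality. For example, see this. It seems to have been converted from a PS file that was typeset with (La)TeX. The font information is all there, though. Is there any way to make the PDF file sharper?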
Answer 1
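1. Download the PostScript output (dvips output)
Try to get the PostScript version; at least the cached version seems to work.
The file was generated by dvips with bitmap fonts, hence the "poor" quality.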
2. Analyze the PostScript file
At the top of the file we find:
%%Creator: dvips 5.47 Copyright 1986-91 Radical Eye Software
This is too old for pkfix, which needs the font comments that dvips 5.58 or later writes. If font comments are present, such as
%DVIPSBitmapFont: Fa cmr7 7 6
then you can skip step 3 and go directly to step 4.
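The check in this step can be scripted. The following is only a sketch: the function names `dvips_version` and `needs_pkfix_helper` are made up for illustration, and it assumes that the `%%Creator:` line and the `%DVIPSBitmapFont:` comments each start at the beginning of a line, as in the excerpts above.

```shell
# Print the dvips version found in the %%Creator header of a PostScript file.
# Prints nothing if the file has no such header.
dvips_version() {
    sed -n 's/^%%Creator: dvips[^0-9]*\([0-9][0-9.]*[a-z]*\).*/\1/p' "$1" | head -n 1
}

# Succeed (exit status 0) if the file contains no %DVIPSBitmapFont comments,
# i.e. pkfix-helper (step 3) has to run before pkfix.
needs_pkfix_helper() {
    ! grep -q '^%DVIPSBitmapFont:' "$1"
}
```

With the header line shown above, `dvips_version crt.ps` should print 5.47, and `needs_pkfix_helper crt.ps && echo "run step 3 first"` tells you whether step 3 can be skipped.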
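3. Prepare the PostScript file for pkfix with pkfix-helper
So this is a job for pkfix-helper. It tries to identify the fonts using heuristics (character boxes) and writes the missing font comments into the PostScript file for pkfix.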
$ pkfix-helper crt.ps crt-helper.ps
Output:
Reading crt.ps ... done.
Number of Type 3 fonts encountered in included documents: 0
Total number of Type 3 fonts encountered: 21
pkfix-helper: Could not determine the target printer resolution; assuming 300 DPI
Finding character widths ... done.
Reading TFM files ... done (103 TFMs in 193 scaling variations).
Matching fonts:
Processing Fp ... done (cmr10 @ 1X, mismatch=0.49957).
Processing Fq ... done (cmr9 @ 1X, mismatch=0.27830).
Processing Fi ... done (cmbx10 @ 1X, mismatch=0.34628).
Processing Fo ... done (cmti10 @ 1X, mismatch=0.17562).
Processing Fe ... done (cmr7 @ 1X, mismatch=0.40178).
Processing Fr ... done (cmssbx10 @ 1X, mismatch=0.24687).
Processing Fl ... done (cmmi10 @ 1X, mismatch=0.23339).
Processing Fn ... done (cmr10 @ 1X, mismatch=0.24939).
Processing Fm ... done (cmsy10 @ 1X, mismatch=0.13897).
Processing Fg ... done (cmti9 @ 1X, mismatch=0.15207).
Processing Fj ... done (cmmi7 @ 1X, mismatch=0.08061).
Processing Fs ... done (cmss10 @ 1.2X, mismatch=0.25618).
Processing Fb ... done (cmbx7 @ 1X, mismatch=0.06942).
Processing Ft ... done (cmbx12 @ 1.2X, mismatch=5.94738).
pkfix-helper: Best match for Ft is rather poor
Processing Fa ... done (cmr7 @ 1X, mismatch=0.07234).
Processing Fh ... done (cmsy7 @ 1X, mismatch=0.01447).
Processing Fc ... done (cmmi5 @ 1X, mismatch=0.00759).
Processing Fd ... done (lasy10 @ 1X, mismatch=0.00005).
Processing Ff ... done (cmex8 @ 1X, mismatch=0.00482).
Processing Fk ... done (cmex10 @ 1X, mismatch=0.00181).
Processing Fu ... done (cmbxti12 @ 2.0733X, mismatch=43.70349).
pkfix-helper: Best match for Fu is rather poor
Take the warnings seriously: the fonts Ft and Fu should be excluded. Keeping a bitmap font is better than using the wrong font.
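If there are many fonts, collecting the flagged names by hand gets tedious. Here is a small sketch that turns the warnings into `-k` options. It assumes the pkfix-helper messages were saved to a log file first (for example with `pkfix-helper crt.ps crt-helper.ps > helper.log 2>&1`), and the function name `poor_match_options` is hypothetical.

```shell
# Print a "-k NAME" option for every font that pkfix-helper reported
# as a "rather poor" match, all on one line.
poor_match_options() {
    sed -n 's/^pkfix-helper: Best match for \([A-Za-z0-9]*\) is rather poor.*/-k \1/p' "$1" |
        tr '\n' ' ' |
        sed 's/ $//'
}
```

The result can then be passed to a second run, e.g. `pkfix-helper $(poor_match_options helper.log) crt.ps crt-helper.ps`. It only reacts to the explicit warning lines, so it is still worth looking at the mismatch values yourself.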
$ pkfix-helper -k Fu -k Ft crt.ps crt-helper.ps
Reading crt.ps ... done.
Number of Type 3 fonts encountered in included documents: 0
Total number of Type 3 fonts encountered: 21
pkfix-helper: Could not determine the target printer resolution; assuming 300 DPI
Finding character widths ... done.
Reading TFM files ... done (103 TFMs in 193 scaling variations).
Matching fonts:
Processing Fp ... done (cmr10 @ 1X, mismatch=0.49957).
Processing Fq ... done (cmr9 @ 1X, mismatch=0.27830).
Processing Fi ... done (cmbx10 @ 1X, mismatch=0.34628).
Processing Fo ... done (cmti10 @ 1X, mismatch=0.17562).
Processing Fe ... done (cmr7 @ 1X, mismatch=0.40178).
Processing Fr ... done (cmssbx10 @ 1X, mismatch=0.24687).
Processing Fl ... done (cmmi10 @ 1X, mismatch=0.23339).
Processing Fn ... done (cmr10 @ 1X, mismatch=0.24939).
Processing Fm ... done (cmsy10 @ 1X, mismatch=0.13897).
Processing Fg ... done (cmti9 @ 1X, mismatch=0.15207).
Processing Fj ... done (cmmi7 @ 1X, mismatch=0.08061).
Processing Fs ... done (cmss10 @ 1.2X, mismatch=0.25618).
Processing Fb ... done (cmbx7 @ 1X, mismatch=0.06942).
Retaining Ft as a bitmapped font.
Processing Fa ... done (cmr7 @ 1X, mismatch=0.07234).
Processing Fh ... done (cmsy7 @ 1X, mismatch=0.01447).
Processing Fc ... done (cmmi5 @ 1X, mismatch=0.00759).
Processing Fd ... done (lasy10 @ 1X, mismatch=0.00005).
Processing Ff ... done (cmex8 @ 1X, mismatch=0.00482).
Processing Fk ... done (cmex10 @ 1X, mismatch=0.00181).
Retaining Fu as a bitmapped font.
4. Run pkfix
Now the file crt-helper.ps is processed by pkfix:
$ pkfix crt-helper.ps crt-fixed.ps
PKFIX 1.7, 2012/04/18 - Copyright (c) 2001, 2005, 2007, 2009, 2011, 2012 by Heiko Oberdiek.
*** Font conversion: `cmr7' -> `CMR7'.
*** Font conversion: `cmbx7' -> `CMBX7'.
*** Font conversion: `cmmi5' -> `CMMI5'.
*** Font conversion: `lasy10' -> `LASY10'.
*** Font conversion: `cmr7' -> `CMR7'.
*** Font conversion: `cmex8' -> `CMEX8'.
*** Font conversion: `cmti9' -> `CMTI9'.
*** Font conversion: `cmsy7' -> `CMSY7'.
*** Font conversion: `cmbx10' -> `CMBX10'.
*** Font conversion: `cmmi7' -> `CMMI7'.
*** Font conversion: `cmex10' -> `CMEX10'.
*** Font conversion: `cmmi10' -> `CMMI10'.
*** Font conversion: `cmsy10' -> `CMSY10'.
*** Font conversion: `cmr10' -> `CMR10'.
*** Font conversion: `cmti10' -> `CMTI10'.
*** Font conversion: `cmr10' -> `CMR10'.
*** Font conversion: `cmr9' -> `CMR9'.
*** Font conversion: `cmssbx10' -> `CMSSBX10'.
*** Font conversion: `cmss10' -> `CMSS10'.
*** Merging font `CMR7' (2).
*** Merging font `CMR10' (2).
==> 19 converted fonts.
==> 2 merged fonts.
5. Convert to PDF
The resulting PS file is then converted to PDF:
$ ps2pdf crt-fixed.ps
6. Check the fonts of the PDF file
The PDF now contains mostly Type 1 (vector) fonts:
$ pdffonts crt-fixed.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
EXRKAF+CMR10                         Type 1C           Custom           yes yes no      31  0
XYBRJB+CMR9                          Type 1C           Custom           yes yes no      29  0
AIEJHE+CMSSBX10                      Type 1C           WinAnsi          yes yes no      27  0
VRVPVC+CMSS10                        Type 1C           WinAnsi          yes yes no      25  0
[none]                               Type 3            Custom           yes no  no      17  0
[none]                               Type 3            Custom           yes no  no       9  0
ECQATD+CMTI10                        Type 1C           Custom           yes yes no      38  0
GUCXJN+CMMI7                         Type 1C           Custom           yes yes no      46  0
QSBVLU+CMEX10                        Type 1C           Custom           yes yes no      44  0
KLJBVV+CMMI10                        Type 1C           Custom           yes yes no      42  0
WMLJNK+CMSY10                        Type 1C           Custom           yes yes yes     40  0
NXXIQZ+CMTI9                         Type 1C           WinAnsi          yes yes no      57  0
GWEEWQ+CMSY7                         Type 1C           Custom           yes yes yes     55  0
VSMLLB+CMBX10                        Type 1C           Custom           yes yes no      53  0
IQGMWX+LASY10                        Type 1C           Custom           yes yes yes     73  0
TOJKBS+CMR7                          Type 1C           Custom           yes yes no      71  0
OTLROK+CMEX8                         Type 1C           Custom           yes yes no      69  0
Times-Roman                          Type 1            Standard         no  no  no     120  0
MQZZBQ+CMBX7                         Type 1C           WinAnsi          yes yes no     133  0
GROEMQ+CMMI5                         Type 1C           WinAnsi          yes yes no     131  0
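To see at a glance how many bitmap fonts are left, the listing can be filtered. This is a rough sketch that reads pdffonts output on standard input, skips the two header lines, and counts the rows whose type column is "Type 3". The function name `count_type3` is made up for this example.

```shell
# Count the Type 3 (bitmap) fonts in `pdffonts` output read from stdin.
# The first two lines (column titles and dashes) are ignored.
count_type3() {
    awk 'NR > 2 && / Type 3 / { n++ } END { print n + 0 }'
}
```

Usage would be `pdffonts crt-fixed.pdf | count_type3`. For the listing above it prints 2, which matches the two fonts (Ft and Fu) that were deliberately kept as bitmaps.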
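Warning: do not forget to check the visual appearance as well. The heuristics of pkfix-helper may have picked the wrong font, pkfix may have failed to recognize something, one of the programs may have a bug, or ...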
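Comparison
As requested, here are two sets of screenshots from AR9/Linux at 200% and 650% magnification. The first set is from the original PDF file (cached version):
The second set shows the same regions of the fixed PDF file (produced as described above):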