How do I make my exported PDF searchable?

I am using pdflatex from LyX to export an English document to PDF. The resulting PDF is not searchable (although all the external PDFs I embed as graphics are searchable).

Also, when I select and copy text, it pastes as "✇❤✐❝❤ ✐s ❞❡✜♥❡❞".

How can I make the PDF I generate searchable?

Under "Use LaTeX font encoding:" it shows "T1", and the LyX file is:

#LyX 2.0 created this file. For more info see http://www.lyx.org/
\lyxformat 413
\begin_document
\begin_header
\textclass article
\begin_preamble
\usepackage{cancel}
\end_preamble
\use_default_options true
\maintain_unincluded_children false
\language canadian
\language_package default
\inputencoding auto
\fontencoding global
\font_roman default
\font_sans default
\font_typewriter default
\font_default_family default
\use_non_tex_fonts false
\font_sc false
\font_osf false
\font_sf_scale 100
\font_tt_scale 100

\graphics default
\default_output_format default
\output_sync 0
\bibtex_command default
\index_command default
\paperfontsize default
\spacing single
\use_hyperref false
\papersize a4paper
\use_geometry true
\use_amsmath 1
\use_esint 1
\use_mhchem 1
\use_mathdots 1
\cite_engine basic
\use_bibtopic false
\use_indices false
\paperorientation portrait
\suppress_date false
\use_refstyle 1
\index Index
\shortcut idx
\color #008000
\end_index
\leftmargin 3cm
\topmargin 2cm
\rightmargin 3cm
\bottommargin 2cm
\headheight 1cm
\headsep 1cm
\footskip 1cm
\secnumdepth 3
\tocdepth 3
\paragraph_separation indent
\paragraph_indentation default
\quotes_language english
\papercolumns 1
\papersides 2
\paperpagestyle fancy
\tracking_changes false
\output_changes false
\html_math_output 0
\html_css_as_file 0
\html_be_strict false
\end_header

\begin_body

\begin_layout Standard
This is some text
\end_layout

\end_body
\end_document

The LaTeX log is:

This is pdfTeX, Version 3.1415926-2.5-1.40.14 (TeX Live 2013/Debian) (format=pdflatex 2015.6.21)  30 SEP 2015 22:53
entering extended mode
 restricted \write18 enabled.
 %&-line parsing enabled.
**newfile1.tex
(./newfile1.tex
LaTeX2e <2011/06/27>
Babel <3.9h> and hyphenation patterns for 12 languages loaded.

(/usr/share/texlive/texmf-dist/tex/latex/base/article.cls
Document Class: article 2007/10/19 v1.4h Standard LaTeX document class
(/usr/share/texlive/texmf-dist/tex/latex/base/size10.clo
File: size10.clo 2007/10/19 v1.4h Standard LaTeX file (size option)
)
\c@part=\count79
\c@section=\count80
\c@subsection=\count81
\c@subsubsection=\count82
\c@paragraph=\count83
\c@subparagraph=\count84
\c@figure=\count85
\c@table=\count86
\abovecaptionskip=\skip41
\belowcaptionskip=\skip42
\bibindent=\dimen102
) (/usr/share/texlive/texmf-dist/tex/latex/base/fontenc.sty
Package: fontenc 2005/09/27 v1.99g Standard LaTeX package
(/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.def
File: t1enc.def 2005/09/27 v1.99g Standard LaTeX file
LaTeX Font Info:    Redeclaring font encoding T1 on input line 43.
)) (/usr/share/texlive/texmf-dist/tex/latex/base/inputenc.sty
Package: inputenc 2008/03/30 v1.1d Input encoding file
\inpenc@prehook=\toks14
\inpenc@posthook=\toks15
(/usr/share/texlive/texmf-dist/tex/latex/base/latin9.def
File: latin9.def 2008/03/30 v1.1d Input encoding file
)) (/usr/share/texlive/texmf-dist/tex/latex/geometry/geometry.sty
Package: geometry 2010/09/12 v5.6 Page Geometry
(/usr/share/texlive/texmf-dist/tex/latex/graphics/keyval.sty
Package: keyval 1999/03/16 v1.13 key=value parser (DPC)
\KV@toks@=\toks16
) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifpdf.sty
Package: ifpdf 2011/01/30 v2.3 Provides the ifpdf switch (HO)
Package ifpdf Info: pdfTeX in PDF mode is detected.
) (/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifvtex.sty
Package: ifvtex 2010/03/01 v1.5 Detect VTeX and its facilities (HO)
Package ifvtex Info: VTeX not detected.
) (/usr/share/texlive/texmf-dist/tex/generic/ifxetex/ifxetex.sty
Package: ifxetex 2010/09/12 v0.6 Provides ifxetex conditional
)
\Gm@cnth=\count87
\Gm@cntv=\count88
\c@Gm@tempcnt=\count89
\Gm@bindingoffset=\dimen103
\Gm@wd@mp=\dimen104
\Gm@odd@mp=\dimen105
\Gm@even@mp=\dimen106
\Gm@layoutwidth=\dimen107
\Gm@layoutheight=\dimen108
\Gm@layouthoffset=\dimen109
\Gm@layoutvoffset=\dimen110
\Gm@dimlist=\toks17
) (/usr/share/texlive/texmf-dist/tex/latex/fancyhdr/fancyhdr.sty
\fancy@headwidth=\skip43
\f@ncyO@elh=\skip44
\f@ncyO@erh=\skip45
\f@ncyO@olh=\skip46
\f@ncyO@orh=\skip47
\f@ncyO@elf=\skip48
\f@ncyO@erf=\skip49
\f@ncyO@olf=\skip50
\f@ncyO@orf=\skip51
) (/usr/share/texlive/texmf-dist/tex/latex/cancel/cancel.sty
Package: cancel 2013/04/12 v2.2 Cancel math terms
) (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.sty
Package: babel 2013/12/03 3.9h The Babel package
(/usr/share/texlive/texmf-dist/tex/generic/babel-english/english.ldf
Language: english 2012/08/20 v3.3p English support from the babel system
(/usr/share/texlive/texmf-dist/tex/generic/babel/babel.def
File: babel.def 2013/12/03 3.9h Babel common definitions
\babel@savecnt=\count90
\U@D=\dimen111
)
\l@canadian = a dialect from \language\l@american 
\l@australian = a dialect from \language\l@british 
\l@newzealand = a dialect from \language\l@british 
))
No file newfile1.aux.
\openout1 = `newfile1.aux'.

LaTeX Font Info:    Checking defaults for OML/cmm/m/it on input line 20.
LaTeX Font Info:    ... okay on input line 20.
LaTeX Font Info:    Checking defaults for T1/cmr/m/n on input line 20.
LaTeX Font Info:    ... okay on input line 20.
LaTeX Font Info:    Checking defaults for OT1/cmr/m/n on input line 20.
LaTeX Font Info:    ... okay on input line 20.
LaTeX Font Info:    Checking defaults for OMS/cmsy/m/n on input line 20.
LaTeX Font Info:    ... okay on input line 20.
LaTeX Font Info:    Checking defaults for OMX/cmex/m/n on input line 20.
LaTeX Font Info:    ... okay on input line 20.
LaTeX Font Info:    Checking defaults for U/cmr/m/n on input line 20.
LaTeX Font Info:    ... okay on input line 20.
*geometry* driver: auto-detecting
*geometry* detected driver: pdftex

Package geometry Warning: The marginal notes overrun the paper.
     Add 46.64174pt and more to the right margin.

*geometry* verbose mode - [ preamble ] result:
* driver: pdftex
* paper: a4paper
* layout: <same size as paper>
* layoutoffset:(h,v)=(0.0pt,0.0pt)
* modes: twoside 
* h-part:(L,W,R)=(85.35826pt, 426.79135pt, 85.35826pt)
* v-part:(T,H,B)=(56.9055pt, 731.23584pt, 56.9055pt)
* \paperwidth=597.50787pt
* \paperheight=845.04684pt
* \textwidth=426.79135pt
* \textheight=731.23584pt
* \oddsidemargin=13.08827pt
* \evensidemargin=13.08827pt
* \topmargin=-72.26997pt
* \headheight=28.45274pt
* \headsep=28.45274pt
* \topskip=10.0pt
* \footskip=28.45274pt
* \marginparwidth=121.0pt
* \marginparsep=11.0pt
* \columnsep=10.0pt
* \skip\footins=9.0pt plus 4.0pt minus 2.0pt
* \hoffset=0.0pt
* \voffset=0.0pt
* \mag=1000
* \@twocolumnfalse
* \@twosidetrue
* \@mparswitchtrue
* \@reversemarginfalse
* (1in=72.27pt=25.4mm, 1cm=28.453pt)

[1

{/var/lib/texmf/fonts/map/pdftex/updmap/pdftex.map}] (./newfile1.aux) ) 
Here is how much of TeX's memory you used:
 1517 strings out of 494724
 18826 string characters out of 6174697
 76456 words of memory out of 5000000
 4899 multiletter control sequences out of 15000+600000
 5222 words of font info for 16 fonts, out of 8000000 for 9000
 102 hyphenation exceptions out of 8191
 31i,10n,43p,227b,230s stack positions out of 5000i,500n,10000p,200000b,80000s
 </home/user/.texmf-var/fonts/pk/ljfour/jknappen/ec/ecrm1000.600pk>
Output written on newfile1.pdf (1 page, 3877 bytes).
PDF statistics:
 22 PDF objects out of 1000 (max. 8388607)
 8 compressed objects within 1 object stream
 0 named destinations out of 1000 (max. 500000)
 1 words of extra memory for PDF output out of 10000 (max. 10000000)

Link to PDF

Answer 1

First, you are actually embedding the PDFs as pictures, so they cannot be searched.

What you need to do is use something like the pdfpages package to include your files.

For example:

\usepackage{pdfpages}
\includepdf[pages={-}]{yourfile.pdf}

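This includes every page of the document (because the range character "-" has no start or end value). If you only want a few pages in the new PDF, you can use pages={1,2,9}, for example.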
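To show where those two lines go, here is a minimal sketch of a complete document. It assumes the pdfpages package is installed and that a file named yourfile.pdf (a placeholder, as in the snippet above) sits next to the .tex file. The article class and the sample sentence are taken from the question's LyX file. In LyX, the \usepackage line would go in the document preamble and the \includepdf line in a TeX-code (ERT) inset.

```latex
\documentclass{article}
\usepackage{pdfpages}  % provides \includepdf

\begin{document}

This is some text

% Include every page of the external PDF ("-" is an open-ended range).
% yourfile.pdf is a placeholder name, not a real file.
\includepdf[pages={-}]{yourfile.pdf}

% Alternative: include only selected pages.
% This assumes the file has at least 9 pages.
% \includepdf[pages={1,2,9}]{yourfile.pdf}

\end{document}
```

Each included page is placed on a page of its own in the output, so the surrounding text resumes on the following page.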
