Is it possible to store the content of each page in a separate file?


My markup is as follows:

\documentclass{book}
\usepackage{showframe}

\begin{document}

Deep learning has transformed computer vision, natural language and speech processing in particular, and artificial intelligence in general. From a bag of semi-discordant tricks, none of which worked satisfactorily on real-life problems, artificial intelligence has become a formidable tool to solve real problems faced by industry, at scale. This is nothing short of a revolution going on under our very noses. To lead the curve of this revolution, it is imperative to understand the underlying principles and abstractions rather than simply memorizing the ``how-to'' steps of some hands-on guide. This is where mathematics comes in.

\section{First Level Head}

In this first chapter, we present an overview of deep learning. This will require us to use some concepts explained in subsequent chapters. Don't worry if there are some open questions at the end of this chapter: it is aimed at orienting your mind toward this difficult subject. As individual concepts become clearer in subsequent chapters, you should consider coming back and re-reading this chapter.

For instance, a cat's brain is often trying to choose between the following options:
\emph{run away} from the object in front of it vs.
\emph{ignore} the object in front of it vs. \emph{approach} the
object in front of it and purr. The cat's brain makes that decision
by processing sensory inputs like the perceived \emph{hardness} of
the object in front of it, the perceived \emph{sharpness} of the
object in front of it,  and so on.


This is an instance of a \emph{classification} problem, where the output is one of a set of possible classes.

Some other examples of classification problems in life are as follows:
\begin{itemize}
\item \emph{Buy} vs. \emph{hold} vs. \emph{sell}
a certain stock, from inputs like the
\emph{price history of this stock} and the
\emph{change in price of the stock in recent times}

\item Object recognition (from an image):
\begin{itemize}
\item Is this a car or a giraffe?
\item Is this a human or a non-human?
\item Is this an inanimate object or a living object?
\item Face recognition---is this Tom or Dick or Mary or Einstein or Messi?
\end{itemize}
\item Action recognition from a video:
\begin{itemize}
\item Is this person running or not running?
\item Is this person picking something up or not?
\item Is this person doing something violent or not?
\end{itemize}
\item Natural language processing (NLP) from digital documents:
\begin{itemize}
\item Does this news article belong to the realm of politics or sports?
\item Does this query phrase match a particular article in the archive?
\end{itemize}
\end{itemize}

\subsection{Second Level Head}

Another instance of quantitative estimation is estimating a house's price based on inputs like current income of the house's owner, crime statistics for the neighborhood, and so on.
Machines that make such quantitative estimators are called \textit{regressors}.

\end{document}

The first page ends with the text: in the archive?


Is it possible to capture the ending text of each page in a separate output file, for example:

Page 1 ends with: in the archive?

Page 2 ends with: ..so and so...

Kindly advise.

Answer 1

If you run pdftotext on the PDF generated above, you get a text file in which the pages are separated by form feed characters (Ctrl-L), and which ends with:

 Is this person doing something violent or not?
• Natural language processing (NLP) from digital documents:
– Does this news article belong to the realm of politics or sports?
– Does this query phrase match a particular article in the archive?

^L2

0.1.1

Second Level Head

Another instance of quantitative estimation is estimating a house’s price based
on inputs like current income of the house’s owner, crime statistics for the
neighborhood, and so on. Machines that make such quantitative estimators are
called regressors.

^L

So if you delete the blank lines and then grab the line above each form feed, you get more or less the desired text:

$ sed -e '/^[ \t]*$/d' aa072.txt |grep -B1 -P "\x0C" 
– Does this query phrase match a particular article in the archive?

2
--
called regressors.
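
The same idea can be sketched in Python when each page's ending text should go to its own file, as the question asks. This is only a minimal sketch under some assumptions: the text file was produced beforehand by running pdftotext on the PDF, pages are separated by form feeds as shown above, and the function names and the page-N-end.txt file naming are hypothetical choices for illustration, not part of any tool.

```python
from pathlib import Path


def page_end_lines(text):
    """Return the last non-blank line of each page of pdftotext output."""
    pages = text.split("\f")  # a form feed (Ctrl-L) closes each page
    # The final form feed leaves an empty trailing chunk; drop only that one
    # so that genuinely blank pages keep their place in the numbering.
    if pages and not pages[-1].strip():
        pages.pop()
    ends = []
    for page in pages:
        lines = [ln.strip() for ln in page.splitlines() if ln.strip()]
        ends.append(lines[-1] if lines else "")  # "" marks a blank page
    return ends


def write_page_ends(txt_path, out_dir="."):
    """Write each page's ending line to page-N-end.txt (hypothetical naming)."""
    text = Path(txt_path).read_text(encoding="utf-8")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for number, line in enumerate(page_end_lines(text), start=1):
        target = out / f"page-{number}-end.txt"
        target.write_text(line + "\n", encoding="utf-8")
```

Calling write_page_ends("aa072.txt") would then create one small file per page in the current directory. As with the sed and grep approach, the result is only approximate: if the page style prints a footer, that footer text may be picked up as the last line of the page instead of the body text.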
