How can I put a side caption on a figure that is wrapped in text?

I am relatively new to LaTeX and have a hard time combining different packages. I am trying to wrap a figure in my text, and I also want the caption to sit beside the figure rather than below it. My current approach looks like this; the red arrows mark what I would like to improve:

[screenshot: current layout; the red arrows mark where the caption should sit]

To wrap the image in the text, I used the wrapfig package as follows:

\begin{wrapfigure}{R}{0.5\textwidth}
    \vspace{-25pt}
    \centering
    \includegraphics[scale=0.41]{figures/transformer3.png}
    \caption[Illustration of multiheaded attention]{Illustration of multiheaded attention. The two highlighted attention heads have learned to associate \textit{"it"} with different parts of the sentence.}
    \label{fig:transformer3}
\end{wrapfigure}

\noindent The projections are parameter matrices $\bm{W}_{i}^{Q} \in \mathbb{R}^{d \times d_q}$, $\bm{W}_{i}^{K} \in \mathbb{R}^{d \times d_k}$, $\bm{W}_{i}^{V} \in \mathbb{R}^{d \times d_v}$ and $\bm{W}^{O} \in \mathbb{R}^{hd_v \times d}$. By applying multiple attention heads, the model is allowed to jointly attend to information at different positions within the input sequence. In figure \ref{fig:transformer3} for example, the orange attention head associates \textit{“it”} with \textit{“The animal”}, while the green attention head has learned an association to “tired”.

\subsubsection*{Outlook on the Empirical Studies}
While the U-Net and the stacked hourglass are already well established architectures in the CV domain, Transformers have been mainly applied on NLP problems so far. However, there is a strong belief within the deep learning community that Transformers may represent a suitable architecture for CV tasks as well. For this reason, the empirical study will investigate on recent approaches to apply self-attention based networks on images. The concepts will then be implemented in a neural network that will be trained on a CV task. Finally, the performance will be evaluated against models that instead rely on the U-Net and the stacked hourglass.

For the side caption, I read that the floatrow package should be useful. However, when I try to combine the two, I get compilation errors. I also found an introductory example of its usage here. Again, I can reproduce that on its own, but I have trouble aligning it properly with my text. Can anyone help me? Thanks a lot!
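For reference, the standalone side-caption setup I can reproduce looks roughly like this (a minimal sketch based on floatrow's \floatsetup keys, without the wrapfig combination that fails):

\documentclass{article}
\usepackage{graphicx}
\usepackage{floatrow}
\usepackage[font={small, sf},labelfont=bf]{caption}
% Put figure captions beside the image, vertically centered on the right
\floatsetup[figure]{capposition=beside,
                    capbesideposition={right,center},
                    capbesidewidth=0.4\textwidth}

\begin{document}
\begin{figure}[ht]
    \includegraphics[width=0.45\textwidth]{example-image-duck}
    \caption{A side caption produced by floatrow alone.}
\end{figure}
\end{document}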

Answer 1

Combining both in the same setup is fragile. I would rather stick with two separate tools: sidecap for a float that interrupts the text, and wrapfig for a float inside the text:

[screenshot: compiled output, a side-captioned figure followed by a text-wrapped figure with its caption beside it]

\documentclass{article}
\usepackage{amssymb, bm}
\usepackage[export]{adjustbox}
\usepackage{wrapfig}
\usepackage[outercaption]{sidecap}
\makeatletter
\def\SC@figure@vpos{m}% vertically center the side caption beside the figure
\makeatother
\usepackage{tabularx}
\usepackage[font={small, sf},labelfont=bf]{caption}

\begin{document}
\noindent The projections are parameter matrices $\bm{W}_{i}^{Q} \in \mathbb{R}^{d \times d_q}$, $\bm{W}_{i}^{K} \in \mathbb{R}^{d \times d_k}$, $\bm{W}_{i}^{V} \in \mathbb{R}^{d \times d_v}$ and $\bm{W}^{O} \in \mathbb{R}^{hd_v \times d}$. By applying multiple attention heads, the model is allowed to jointly attend to information at different positions within the input sequence.
\begin{SCfigure}[50][ht]
    \centering
    \includegraphics[scale=0.41]{example-image-duck}%{figures/transformer3.png}
    \caption[Illustration of multiheaded attention]
            {Illustration of multiheaded attention. The two highlighted attention heads have learned to associate \textit{"it"} with different parts of the sentence.}
    \label{fig:transformer3}
\end{SCfigure}
In figure \ref{fig:transformer3} for example, the orange attention head associates \textit{“it”} with \textit{“The animal”}, while the green attention head has learned an association to “tired”.

\subsubsection*{Outlook on the Empirical Studies}
\begin{wrapfigure}[5]{R}{0.65\textwidth}
\vspace{-1.75\baselineskip}
    \begin{tabularx}{\linewidth}{@{} cX @{}}
    \includegraphics[scale=0.41,valign=T]{example-image-duck}%{figures/transformer3.png}
    &
    \caption[Illustration of multiheaded attention]
            {Illustration of multiheaded attention. The two highlighted attention heads have learned to associate \textit{"it"} with different parts of the sentence.}
    \label{fig:transformer3-wrapped}% renamed: reusing fig:transformer3 would make the label multiply defined
    \end{tabularx}
\end{wrapfigure}
While the U-Net and the stacked hourglass are already well established architectures in the CV domain, Transformers have been mainly applied on NLP problems so far. However, there is a strong belief within the deep learning community that Transformers may represent a suitable architecture for CV tasks as well. For this reason, the empirical study will investigate on recent approaches to apply self-attention based networks on images. The concepts will then be implemented in a neural network that will be trained on a CV task. Finally, the performance will be evaluated against models that instead rely on the U-Net and the stacked hourglass.
\end{document}
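Two remarks on the example: in \begin{SCfigure}[50][ht], the first optional argument sets the caption width relative to the figure width, and a large value such as 50 simply lets the caption fill all of the remaining space; the \SC@figure@vpos patch forces the caption to be vertically centered next to the figure. The wrapped variant builds its side caption by hand: a tabularx inside the wrapfigure puts the image (top-aligned via adjustbox's valign=T) in the first column and the \caption in the second.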

Answer 2

This shows how to do it with paracol. The only catch is that you have to split the paragraph manually with \splitpar and \continuepar. On the other hand, paracol is more powerful than wrapfig.

\documentclass{article}
\usepackage{amssymb, bm}
\usepackage[export]{adjustbox}
\usepackage{paracol}
\usepackage[font={small, sf},labelfont=bf]{caption}

\newsavebox{\textbox}
\newcommand{\splitpar}[2][\textwidth]{% #1 = width of column (optional), #2 = rest of paragraph after split
  \unskip\strut{\parfillskip=0pt\parskip=0pt\par}% end the paragraph with its last line flush right
  \global\setbox\textbox=\vbox{\hsize=#1\relax\noindent\strut #2\strut}}% stash the rest of the paragraph
\newcommand{\continuepar}{\unvbox\textbox}% emit the stashed remainder

\begin{document}
\setcolumnwidth{\dimexpr 0.5\textwidth-\columnsep}% second column uses remainder
\begin{paracol}{2}
\sloppy% SOP for narrow columns
\noindent The projections are parameter matrices $\bm{W}_{i}^{Q} \in \mathbb{R}^{d \times d_q}$, $\bm{W}_{i}^{K} \in \mathbb{R}^{d \times d_k}$, $\bm{W}_{i}^{V} \in \mathbb{R}^{d \times d_v}$ and $\bm{W}^{O} \in \mathbb{R}^{hd_v \times d}$. By applying multiple attention heads, the model is allowed to jointly attend to information at different positions within the input sequence.
In figure \ref{fig:transformer3} for example, the orange attention head associates \textit{“it”} with \textit{“The animal”}, while the green attention head has learned an association to “tired”.

\switchcolumn
\begin{figure}[h!]
    \includegraphics[width=\linewidth, height=4in]{example-image}
\end{figure}
\switchcolumn
\begin{figure}[h]
    \caption[Illustration of multiheaded attention]
            {Illustration of multiheaded attention. The two highlighted attention heads have learned to associate \textit{"it"} with different parts of the sentence.}
    \label{fig:transformer3}
\end{figure}

\subsubsection*{Outlook on the Empirical Studies}
While the U-Net and the stacked hourglass are already well established architectures in the CV domain, 
Transformers have been mainly appl-\splitpar{ied on NLP problems so far. However, there is a strong belief within the deep learning community that Transformers may  represent a suitable architecture for CV tasks as well. For this reason, the empirical study will investigate on recent approaches to apply self-attention based networks on images. The concepts will then be implemented in a neural network that will be trained on a CV task. Finally, the performance will be evaluated against models that instead rely on the U-Net and the stacked hourglass.}
\end{paracol}
\continuepar
\end{document}
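What happens under the hood: \splitpar ends the current paragraph without the usual \parfillskip, so its last line runs flush to the column's right margin, and it stores the remainder of the paragraph, typeset at full \textwidth (or the optional width argument), in a save box; \continuepar then unboxes that material after paracol has ended, so the paragraph appears to continue seamlessly below the two-column block. The manual hyphen in appl-\splitpar{ied ...} fakes the hyphenation across the split point.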

[demo of the compiled output]
