Overfull \vbox has occurred while \output is active

I have the following MWE:

\documentclass[]{scrbook}

\usepackage{longtable}
\usepackage{tabularx, booktabs}
\usepackage{multirow}

\begin{document}

\chapter{Discussion}

In this chapter the previously mentioned results will be discussed before conclusions are drawn from this in the next chapter.

\section{Computation Time}
In table \ref{tab:computationtime} the computation time of every experiment is clearly summarized.

Catching the eye are several outliers like \textsl{CTGAN cashier shift 1} and \textsl{CTGAN cashier shift 7} with long durations compared to the other experiments. One can assume, that the shifting and therefore a bigger training dataset for the fitting of CTGAN causes the long computation. But in accordance to this explanation, \textsl{CTGAN cashier shift 1 7} should have an even longer computation time which is not the case. Therefore the assumption is close that these results may be due to technical reasons. Increased traffic on the calculating machines at the time of the calculation could have lead to the delays. But this assumption can not be verified and proven. One can argue that the user is only interested in knowing the absolute computation time without paying respect to how many computation resources the algorithm occupies. That would correspond to the time stopping method used in this thesis (see \ref{experimental setup}. But in research we are also interested in the overall efficiency of the algorithm, what could have been evaluated by stopping the CPU time, which measures the computation time while including the CPU usage of the model. This is also possible with the time module with the function process\_time().

The out-of-date version TensorFlow 1 can be one reason for the slow processing of timeGAN, as the current versions of TensorFlow 2 may work more efficient. Also, timeGAN (to the time this thesis was submitted) does not have a CUDA-support. As described in \ref{timeGANrelatedwork}, the training of timeGAN consists of three phases. This extensive process may also cause long computation. 
Additional to the processing speed of timeGAN, other time-consuming, practical arrangements have to be done, like downgrading to a low python version ($\leq 3.7$) to be suitable for TensorFlow 1.

There we can clearly see that DGW in general computes fake data the fastest with 6.26 Minutes in average. Later in section \ref{GANvsDGW} it becomes clear that this was accompanied by losses in performance concerning the quality of the fake data. DGW could even be faster if it would be rewritten in a library like TensorFlow of PyTorch with CUDA-support to use the GPU for calculation, which is not the case so far.

The experiments confirm the general expectation that small datasets lead to faster synthesized data than big datasets. As small we can classify the QuarterlyTouristsIndia and the \textit{MonthlyMilkProduction} dataset. The computation time for the DGW experiments with these two datasets is so short that we can not draw reliable conclusions from that. But from the other experiments it is clear that the univariate \textit{MonthlyMilkProduction} dataset leads to a faster performance that the multivariate (and therefore also bigger) QuarterlyTouristsIndia dataset.

In general it can be observed that there is a great variation in the computation times, even within one group of experiments with the same model and dataset. Additionally the computation time does not follow the expected behaviour that shifting leads to longer durations as the training dataset is augmented as already mentioned before. 
The reason for this unexpected behaviour may be the technical issues with the traffic on the hardware explained earlier. The expected longer duration with shifted datasets occurs too rarely to generalize it as a regularity.

Generally it is recommended let data generating GANs compute on a local storage and not on e.g. a network storage. This can cause long lasting computations because of the high data input/output traffic over the network. 

\newcolumntype{A}[1]{>{\raggedright\let\newline\\\arraybackslash\hspace{0pt}}p{#1}}
\begin{longtable}{A{0.30\textwidth}A{0.19\textwidth}A{0.19\textwidth}A{0.19\textwidth}}
    \caption{Computation time of the experiments.}
    \label{tab:computationtime}\\ \toprule
    \small
        \textbf{Experiment} & \textbf{Computation Time} & \textbf{Average per model and dataset} & \textbf{Average per model} \\ \toprule 
        \textsl{CTGAN cashier no shift} & 45 Minutes & \multirow{4}{*}{5.55 Hours} & \multirow{12}{*}{2.41 Hours} \\ 
        \textsl{CTGAN cashier shift 1} & 10.45 Hours &  & \\ 
        \textsl{CTGAN cashier shift 7} & 10.35 Hours &  & \\ 
        \textsl{CTGAN cashier shift 1 7} & 41 Minutes &  & \\ \cmidrule{1-3}
        \textsl{CTGAN milk no shift} & 9 Minutes & \multirow{4}{*}{7.08 Minutes} & \\ 
        \textsl{CTGAN milk shift 1} & 1 Minutes &  & \\
        \textsl{CTGAN milk shift 7} & 18 Minutes &  & \\ 
        \textsl{CTGAN milk shift 1 7} & 18.6 Seconds &  & \\ \cmidrule{1-3}
        \textsl{CTGAN india no shift} & 39 Minutes & \multirow{4}{*}{1.56 Hours} & \\ 
        \textsl{CTGAN india shift 1} & 2.25 Hours &  & \\ 
        \textsl{CTGAN india shift 7} & 6 Minutes &  & \\ 
        \textsl{CTGAN india shift 1 7} & 3.25 Hours &  & \\ \midrule
        \textsl{timeGAN cashier no shift} & 1.18 Hours & \multirow{4}{*}{} & \multirow{12}{*}{} \\ 
        \textsl{timeGAN cashier shift 1} & 6.85 Hours &  & \\ 
        \textsl{timeGAN cashier year noshift} &  &  & \\ 
        \textsl{timeGAN cashier year shift 1} &  &  & \\ \cmidrule{1-3}
        \textsl{timeGAN milk no shift} & 35 Minutes & \multirow{4}{*}{56.7 Minutes} & \\ 
        \textsl{timeGAN milk shift 1} & 1.93 Hours &  & \\ 
        \textsl{timeGAN milk year noshift} & 7 Minutes &  & \\ 
        \textsl{timeGAN milk year shift 1} & 1.15 Hours &  & \\ \cmidrule{1-3}
        \textsl{timeGAN india no shift} & 12 Minutes & \multirow{4}{*}{1.5 Hours} & \\ 
        \textsl{timeGAN india shift 1} & 48 Minutes &  & \\ 
        \textsl{timeGAN india year noshift} & 4.9 Hours &  & \\ 
        \textsl{timeGAN india year shift 1} & 7 Minutes &  & \\ \midrule
        \textsl{DGW cashier no shift} & 31 Minutes & \multirow{4}{*}{18.78 Minutes} & \multirow{16}{*}{58.7 Minutes} \\ 
        \textsl{DGW cashier shift 1} & 1.25 Hours &  & \\ 
        \textsl{DGW cashier shift 7} & 3 Seconds &  & \\ 
        \textsl{DGW cashier shift 1 7} & 3.6 Seconds &  & \\ \cmidrule{1-3}
        \textsl{DGW schacht no shift} & 1.4 Hours & \multirow{4}{*}{36 Hours} & \\ 
        \textsl{DGW schacht shift 1} & 8.6 Hours &  & \\ 
        \textsl{DGW schacht shift 7} & 43.1 Hours &  & \\ 
        \textsl{DGW schacht shift 1 7} & 90.89 Hours &  & \\ \cmidrule{1-3}
        \textsl{DGW milk no shift} & 0.12 Seconds & \multirow{4}{*}{0.072 Seconds} & \\ 
        \textsl{DGW milk shift 1} & 0.06 Seconds &  & \\ 
        \textsl{DGW milk shift 7} & 0.06 Seconds &  & \\ 
        \textsl{DGW milk shift 1 7} & 0.048 Seconds &  & \\ \cmidrule{1-3}
        \textsl{DGW india no shift} & 0.036 Seconds & \multirow{4}{*}{0.033 Seconds} & \\ 
        \textsl{DGW india shift 1} & 0.036 Seconds &  & \\ 
        \textsl{DGW india shift 7} & 0.03 Seconds &  & \\ 
        \textsl{DGW india shift 1 7} & 0.03 Seconds &  & \\ \bottomrule
\end{longtable}

\end{document}

Unfortunately, this does not exactly reproduce the warning I want to ask about; I was not able to create an MWE that does.

The warning I get is:

Overfull \vbox (11.77737pt too high) has occurred while \output is active []

In the log file, the warning appears as follows:

(chapters/05_discussion.tex [48]
chapter 5.
[49
] [50] [51]
Overfull \vbox (11.77737pt too high) has occurred while \output is active []

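I know that you can adjust \setlength\headheight{10pt} (see "Overfull \vbox (7.96234pt too high) has occurred while \output is active"), but that did not resolve the warning either. That is why I am asking this question: it is not a duplicate, because that solution does not work here.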

Answer 1

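One of the tables in this chapter was too tall. I shrank it to fit the text height, which should make the warning disappear. Sorry for the inconvenience, I misread the log file...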
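
The answer does not show how the table was shrunk. One possible way is sketched below. It assumes the table in question is an ordinary tabular inside a table float (not a longtable, which breaks across pages by itself), and it uses the adjustbox package, which is not part of the original MWE. The factor 0.9 is an arbitrary choice that leaves some room for the caption.

```latex
\documentclass{scrbook}

\usepackage{booktabs}
\usepackage{adjustbox} % assumption: not in the original MWE; provides "max totalheight"

\begin{document}

\begin{table}
    \centering
    \caption{Computation time of the experiments.}
    \label{tab:computationtime}
    % The content is scaled down only if it is taller than 0.9\textheight;
    % a table that already fits is left at its natural size.
    \begin{adjustbox}{max totalheight=0.9\textheight}
        \begin{tabular}{ll}
            \toprule
            \textbf{Experiment} & \textbf{Computation Time} \\
            \midrule
            \textsl{CTGAN cashier no shift} & 45 Minutes \\
            \textsl{CTGAN cashier shift 1} & 10.45 Hours \\
            % ... remaining rows of the real table go here ...
            \bottomrule
        \end{tabular}
    \end{adjustbox}
\end{table}

\end{document}
```

Scaling also shrinks the font, so for a very long table it may be better to let longtable split it across pages instead.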
