Finding the positions of blank lines in a .csv file

This is related to a question I asked earlier: Plotting data blocks of different lengths with PGFPlots and gnuplot

Here is a minimal working example:

% !TeX TXS-program:compile = txs:///pdflatex/[--shell-escape]
\documentclass[a4paper,12pt]{article}
\usepackage{pgfplots,filecontents}

\begin{filecontents*}{data.csv}
"Amplitude","notes: data set 1",
X,Y,
1,1,
2,2,
3,3,
4,4,

"Amplitude","notes: data set 2",
X,Y,
1,7,
2,6,
3,5,
4,4,
5,3,
6,2,
7,1,

"CH1","notes: data set 1",
"CH1","notes: data set 2",
\end{filecontents*}

\begin{document}
    \begin{tikzpicture}
        \begin{axis} [width=0.5\textwidth,height=7cm,
        ]
        \addplot gnuplot [raw gnuplot, mark=none, black]{
            set datafile separator comma;
            plot "<(sed -n '3,6p' data.csv)" using 1:2 with lines;
        };
        \addplot gnuplot [raw gnuplot, mark=none, red]{
            set datafile separator comma;
            plot "<(sed -n '10,16p' data.csv)" using 1:2 with lines;
        };
        \end{axis}
    \end{tikzpicture}
\end{document}

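My question is: is there a way to get TeX to find the blank lines in the .csv file, using datatool or something similar? If I could get the positions of the blank lines, that would tell me (a) that there are two data sets, and (b) how long each data set is.

In this example, the blank lines are at lines 7 and 17. So data set 1 runs from line 3 to line 6 (7-1), and data set 2 runs from line 10 (7+3) to line 16 (17-1). That would let me generate the gnuplot commands automatically and answer Plotting data blocks of different lengths with PGFPlots and gnuplot.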
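
To make the arithmetic concrete, here is a minimal sketch that turns the two blank-line positions into the line ranges for sed. The macro names \firstblank and \secondblank are hypothetical and the positions are entered by hand; finding them automatically is exactly what I am asking for.

```latex
\documentclass{article}
\begin{document}
% Blank-line positions from the example above (hypothetical macro names).
\newcommand\firstblank{7}
\newcommand\secondblank{17}
% Each block has two header lines, so its data start 3 lines after the
% preceding blank line (line 3 for the first block) and end 1 line
% before the next blank line.
Data set 1: lines \the\numexpr 0+3\relax--\the\numexpr\firstblank-1\relax.

Data set 2: lines \the\numexpr\firstblank+3\relax--\the\numexpr\secondblank-1\relax.
\end{document}
```

This typesets the ranges 3--6 and 10--16, which are the values hard-coded in the two sed calls above.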

Answer 1

The xstring package might help:

\documentclass[a4paper,12pt]{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{filecontents,xstring}
\begin{filecontents}{data.csv}
"Amplitude","notes: data set 1",
X,Y,
1,1,
2,2,
3,3,
4,4,

"Amplitude","notes: data set 2",
X,Y,
1,7,
2,6,
3,5,
4,4,
5,3,
6,2,
7,1,

"Amplitude","notes: data set 3",
X,Y,
1,2
3,4
5,6,

"CH1","notes: data set 1",
"CH1","notes: data set 2",
\end{filecontents}
\newcount\cntblanklines
\newcount\currentdataline
\begingroup
    \catcode0 12
    \begingroup\everyeof{\noexpand}\endlinechar0\xdef\datas{\csname @@input\endcsname data.csv }\endgroup
    \StrCount\datas{^^00^^00}[\nbblanklines]\global\let\nbblanklines\nbblanklines
    \loop
        \ifnum\cntblanklines<\nbblanklines\relax
        \advance\cntblanklines1
        \StrCut\datas{^^00^^00}\currentdatas\datas
        \StrCount\currentdatas{^^00}[\currentblocklength]%
        \expandafter\xdef\csname blockbegin\romannumeral\cntblanklines\endcsname{\number\numexpr\currentdataline+3}%
        \expandafter\xdef\csname   blockend\romannumeral\cntblanklines\endcsname{\number\numexpr\currentblocklength+1+\currentdataline}%
        \advance\currentdataline\numexpr\currentblocklength+2\relax
    \repeat
\endgroup%
\begin{document}
Number of blank lines : \nbblanklines

Block 1 : \blockbegini--\blockendi

Block 2 : \blockbeginii--\blockendii

Block 3 : \blockbeginiii--\blockendiii
\end{document}
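
For readers unfamiliar with xstring, here is a reduced sketch of the two macros the loop relies on, \StrCount and \StrCut. It runs on a toy string in which ";" stands in for the end-of-line character, so ";;" plays the role of a blank line. The string and the macro names \toydata, \nblank, \firstblock, \remainder and \nsep are made up for illustration.

```latex
\documentclass{article}
\usepackage{xstring}
\begin{document}
% Toy string: ";" stands in for an end-of-line, so ";;" marks a blank line.
\def\toydata{a;b;;c;d;e;;f}
% Count the blank-line markers.
\StrCount{\toydata}{;;}[\nblank]
% Split at the first marker: text before it and text after it.
\StrCut{\toydata}{;;}\firstblock\remainder
% Count the line ends inside the first block.
\StrCount{\firstblock}{;}[\nsep]
Markers: \nblank.

First block: \firstblock.

Remainder: \remainder.

Line ends inside the first block: \nsep.
\end{document}
```

The code above applies the same two steps repeatedly, each time cutting the next block off the front of \datas and counting its lines.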

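Answer 2

I originally modified the definition of \readdef (from the readarray package) so that it does not ignore blank lines in the input. However, I have since edited the answer to define \simplereaddef instead, which removes the overhead of the readarray package altogether.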

\documentclass[a4paper,12pt]{article}
\usepackage{pgfplots,filecontents,listofitems}
\makeatletter
\newread\readfile
\newcommand\simplereaddef[3][,]{%
  \catcode\endlinechar=9 %
  \def#3{}%
  \openin\readfile=#2%
  \loop\unless\ifeof\readfile%
    \read\readfile to\readfileline % Reads a line of the file into \readfileline%
    \expandafter\g@addto@macro\expandafter#3\expandafter{\readfileline}%
    \g@addto@macro#3{#1}% ADD record-delim TO END OF EACH RECORD
  \repeat%
  \closein\readfile%
  \catcode\endlinechar=5 %
}
\makeatother
\begin{filecontents*}{mydata.csv}
"Amplitude","notes: data set 1",
X,Y,
1,1,
2,2,
3,3,
4,4,

"Amplitude","notes: data set 2",
X,Y,
1,7,
2,6,
3,5,
4,4,
5,3,
6,2,
7,1,

"CH1","notes: data set 1",
"CH1","notes: data set 2",
\end{filecontents*}
\begin{document}
\simplereaddef[\\]{mydata.csv}\mydata
\setsepchar{\\}
\readlist\myarray{\mydata}
\noindent\foreachitem\x\in\myarray[]{%
  \ifnum\xcnt<\listlen\myarray[]\relax%
    \expandafter\ifx\expandafter\relax\x\relax Line \xcnt{} blank\\\fi%
  \fi}
\end{document}
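
If only the positions of the blank lines are needed, a more reduced variant may be enough. The sketch below is not the code above: it assumes the default \endlinechar, in which case \read stores a lone \par for an empty line, and compares each line against that. The file name blocks.csv and the names \csvfile, \lineno, \currentline and \blankline are hypothetical. It also assumes the file contains no characters that are special to TeX, such as \#, \% or unbalanced braces.

```latex
\documentclass{article}
\usepackage{filecontents}
\begin{filecontents*}{blocks.csv}
X,Y,
1,1,

X,Y,
1,7,
\end{filecontents*}
\newread\csvfile
\newcount\lineno
\def\blankline{\par}% what \read stores for an empty line
\begin{document}
\openin\csvfile=blocks.csv
\loop\unless\ifeof\csvfile
  \read\csvfile to\currentline
  \advance\lineno by 1
  \ifx\currentline\blankline
    % TeX also delivers one empty line at end of file; skip that one.
    \unless\ifeof\csvfile \typeout{Blank line at line \the\lineno}\fi
  \fi
\repeat
\closein\csvfile
Done.
\end{document}
```

In this toy file the only blank line is line 3, so that is the position the log should report.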

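Addendum

The overall goal is not just to find the blank records of the file, but to use them to obtain the sublists of the original file that the blank records delimit. Here is one way to do that, edited to use listofitems nested lists.

In the \sublist output, I also prefix each sub-record with its index, in the form #), to show that a sublist is not just a block of text, but a set of individually accessible records within that block.

Edited to use the \simplereaddef macro rather than modifying the one from the readarray package.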

\documentclass[a4paper,12pt]{article}
\usepackage{pgfplots,filecontents,pgffor,listofitems}
\makeatletter
\newread\readfile
\newcommand\simplereaddef[3][,]{%
  \catcode\endlinechar=9 %
  \def#3{}%
  \openin\readfile=#2%
  \loop\unless\ifeof\readfile%
    \read\readfile to\readfileline % Reads a line of the file into \readfileline%
    \expandafter\g@addto@macro\expandafter#3\expandafter{\readfileline}%
    \g@addto@macro#3{#1}% ADD record-delim TO END OF EACH RECORD
  \repeat%
  \closein\readfile%
  \catcode\endlinechar=5 %
}
\makeatother
\begin{filecontents*}{mydata.csv}
"Amplitude","notes: data set 1",
X,Y,
1,1,
2,2,
3,3,
4,4,

"Amplitude","notes: data set 2",
X,Y,
1,7,
2,6,
3,5,
4,4,
5,3,
6,2,
7,1,

"CH1","notes: data set 1",
"CH1","notes: data set 2",
\end{filecontents*}
\newcommand\sublist[1]{SUBLIST #1:\\\foreachitem\x\in\myarray[#1]{\xcnt) \x\\}\par}
\begin{document}
\simplereaddef[\\]{mydata.csv}\mydata% OPTIONAL ARG IS record-delim (DEFAULT ,)
\setsepchar{\\\\/\\}% OF FORM {2X record-delim / record-delim}
\ignoreemptyitems
\readlist\myarray{\mydata}
Number of non-empty sublists: \listlen\myarray[]

\sublist{1}

\sublist{3}

\sublist{2}
\end{document}
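
The nested-separator idea can be seen in isolation in the following sketch. It uses a toy string in which ";" separates records and ";;" separates blocks, mirroring the {2X record-delim / record-delim} form above. The data and the list name \blocks are made up for illustration.

```latex
\documentclass{article}
\usepackage{listofitems}
\begin{document}
% Level-1 separator ";;" (between blocks), level-2 separator ";" (between records).
\setsepchar{;;/;}
\readlist\blocks{a;b;;c;d;e}
% Number of top-level items (blocks).
Blocks: \listlen\blocks[].

% Number of records inside block 2.
Records in block 2: \listlen\blocks[2].

% Direct access to a single record by block and record index.
Third record of block 2: \blocks[2,3].
\end{document}
```

Here the list has 2 blocks, block 2 has 3 records, and \blocks[2,3] is e.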

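Answer 3

This is an incomplete collection of thoughts. It merely reports that pgfplots does come with a way of handling empty lines, and that one can hook into the scanline machinery (empty line=scanline, see p. 45 of the manual). The main thing I propose here is to use this information by adding

  \xdef\BlockLength{\pgfplots@scanlinelength}% 

to \pgfplotsscanlinelength@scanline@complete in order to keep track of the block length. What is done below is to run through the file, find the lengths of the blocks (including the headers), and record them. This is neither elegant nor sufficiently tested, let alone a complete answer, but it seems to pass some very basic checks.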

\documentclass[a4paper,12pt]{article}
\usepackage{pgfplots,filecontents}

\begin{filecontents*}{data.csv}
"Amplitude","notes: data set 1",
X,Y,
1,1,
2,2,
3,3,
4,4,

"Amplitude","notes: data set 2",
X,Y,
1,7,
2,6,
3,5,
4,4,
5,3,
6,2,
7,1,

"Amplitude","notes: data set 3",
X,Y,
3,5,
4,4,
5,3,
6,2,
7,1,

"Amplitude","notes: data set 4",
X,Y,
3,5,
4,4,
5,3,
6,2,
6,2,
7,1,

"CH1","notes: data set 1",
"CH1","notes: data set 2",
\end{filecontents*}
\makeatletter
\def\pgfplotsscanlinelength@scanline@complete{%
        \ifnum\pgfplots@scanlinelength>0
                \ifnum\c@pgfplots@scanlineindex=0
                        %
                        % \pgfplotsscanlinecomplete
                        % \pgfplotsscanlinecomplete
                        % \pgfplotsscanlinecomplete
                        % should have the same effect as a single statement. Do
                        % nothing here.
                \else
                        \ifnum\pgfplots@scanlinelength=\c@pgfplots@scanlineindex\relax
                        \else
%\message{Found inconsistent scan line length: \pgfplots@scanlinelength\space vs. \the\c@pgfplots@scanlineindex\space near line \pgfplotstablelineno.}%
                                % special marker which means 'inconsistent scan line length found'
                                \def\pgfplots@scanlinelength{-2}%
                        \fi
                        \pgfplotsplothandlernotifyscanlinecomplete
                \fi
        \else
                \ifnum\pgfplots@scanlinelength=-2
                \else
                        \edef\pgfplots@scanlinelength{\the\c@pgfplots@scanlineindex}%
                        \xdef\BlockLength{\pgfplots@scanlinelength}%
                \fi
                %
                \ifnum\c@pgfplots@scanlineindex>0
                        \pgfplotsplothandlernotifyscanlinecomplete
                \fi
        \fi
        \c@pgfplots@scanlineindex=0
        \pgfplotsutil@advancestringcounter\pgfplotsscanlineindex%
}
\makeatother
\newsavebox{\NonSense}
\begin{document}
\begin{lrbox}{\NonSense}
\begin{tikzpicture}
\pgfplotstableread[comment chars={"}]{data.csv}\loadedtable
\pgfplotstablegetrowsof{\loadedtable}
\pgfmathtruncatemacro{\DataLines}{\pgfplotsretval+1}
\typeout{\DataLines}
\def\SkipLength{0}
\foreach \X in {0,...,12}
{\begin{axis} 
    \addplot[empty line=scanline] table[x expr=0,y expr=0,skip first n=\SkipLength] {data.csv};
\end{axis}
\ifnum\X=0
\xdef\LstBlocks{\BlockLength}
\pgfmathtruncatemacro{\SkipLength}{\BlockLength+2}
\xdef\SkipLength{\SkipLength}
\else
\xdef\LstBlocks{\LstBlocks,\BlockLength}
\pgfmathtruncatemacro{\SkipLength}{\SkipLength+\BlockLength+2}
\xdef\SkipLength{\SkipLength}
\fi
\ifnum\SkipLength>\DataLines
\breakforeach
\fi
}
\end{tikzpicture}
\end{lrbox}
blocks:\LstBlocks
\end{document}

So the point of this post is to report the empty line option, in the hope that some expert might find it useful for a real and complete solution.
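
For reference, here is a minimal sketch of the empty line key on its own, without any redefinition of internals. It uses the value jump rather than scanline, because its effect is directly visible on a line plot. The file name twoblocks.dat and its contents are made up for illustration.

```latex
\documentclass{article}
\usepackage{pgfplots,filecontents}
\pgfplotsset{compat=1.8}
\begin{filecontents*}{twoblocks.dat}
x y
1 1
2 2

3 1
4 2
\end{filecontents*}
\begin{document}
\begin{tikzpicture}
  \begin{axis}
    % empty line=jump: the blank line in the file interrupts the plot,
    % so the two blocks are drawn as separate line segments.
    \addplot[empty line=jump] table {twoblocks.dat};
  \end{axis}
\end{tikzpicture}
\end{document}
```

This only shows that pgfplots notices the blank line. Getting the block lengths out still needs something like the patch above.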
