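I have many .csv files that I need to plot. However, the numeric data sometimes appears in quotes and sometimes as bare numbers (no quotes).
For example: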
a,b,c,d
"1","4","5","1"
"2","3","1","5"
"3","5","6","1"
4,1,4,9
5,3,4,7
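My question is whether pgfplots has an option that preprocesses the data and strips the quotes when they are present. Or is doing this from here (LaTeX and friends) too cumbersome, so that I should write an external script to preprocess the data instead?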
\documentclass{article}
\usepackage{pgfplots}
\usepackage{filecontents}
\begin{filecontents*}{data.csv}
a,b,c,d
"1","4","5","1"
"2","3","1","5"
"3","5","6","1"
4,1,4,9
5,3,4,7
\end{filecontents*}
\begin{document}
\begin{tikzpicture}
\begin{axis}
\addplot table [x=a, y=c, col sep=comma] {data.csv};
\end{axis}
\end{tikzpicture}
\end{document}
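Answer 1
OK, after searching further for a solution, I found that pgfplotstable can automatically ignore certain characters via the ignore chars option.
My current solution is: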
\documentclass{article}
\usepackage{pgfplots, pgfplotstable}
\usepackage{filecontents}
\begin{filecontents*}{data.csv}
a,b,c,d
"1","4","5","1"
"2","3","1","5"
"3","5","6","1"
4,1,4,9
5,3,4,7
\end{filecontents*}
\pgfplotstableread[col sep = comma, ignore chars={"}]{data.csv}\mydata
\begin{document}
\begin{tikzpicture}
\begin{axis}
\addplot table [x=a, y=c] {\mydata};
\end{axis}
\end{tikzpicture}
\end{document}
I am not sure whether this is the best solution or whether it is reliable. I will report back if I run into problems.
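One way to check how the file is actually parsed is to typeset the table that was read. The following is only a minimal sketch and not part of the solution above: it assumes a shortened data.csv with one quoted and one unquoted row, and uses \pgfplotstabletypeset from pgfplotstable to print the parsed values. If ignore chars does its job, both rows should show up as plain numbers.
\documentclass{article}
\usepackage{pgfplots, pgfplotstable}
\usepackage{filecontents}
\begin{filecontents*}{data.csv}
a,b,c,d
"1","4","5","1"
4,1,4,9
\end{filecontents*}
% Read the file once, dropping every " character while parsing
\pgfplotstableread[col sep = comma, ignore chars={"}]{data.csv}\mydata
\begin{document}
% Print the parsed table so the quoted and unquoted rows can be compared
\pgfplotstabletypeset\mydata
\end{document}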
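Answer 2
If you are not tied to pgfplots, dataplot can handle quotes in CSV files. It is far more limited than pgfplots, and it may not cope well if your real data is much larger than the data in the MWE. Still, I thought I would add it as a possible alternative in case it is useful.
Since dataplot relies on datatool, the data must be loaded first before it can be plotted: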
\documentclass{article}
\usepackage{dataplot}
\usepackage{filecontents}
\begin{filecontents*}{data.csv}
a,b,c,d
"1","4","5","1"
"2","3","1","5"
"3","5","6","1"
4,1,4,9
5,3,4,7
\end{filecontents*}
\DTLloaddb{data}{data.csv}
\begin{document}
\DTLplot{data}{x=a,y=c}
\end{document}
You can change the default appearance with additional keys and various commands. For example:
\documentclass{article}
\usepackage{dataplot}
\usepackage{filecontents}
\begin{filecontents*}{data.csv}
a,b,c,d
"1","4","5","1"
"2","3","1","5"
"3","5","6","1"
4,1,4,9
5,3,4,7
\end{filecontents*}
\DTLloaddb{data}{data.csv}
\begin{document}
\setcounter{DTLplotroundXvar}{0}% round x tic values to 0 dp
\setcounter{DTLplotroundYvar}{0}% round y tic values to 0 dp
\DTLplot{data}{colors={blue},% line colours
x=a,% column for x values
y=c,% column for y values
box,% box around plot
style=both,% lines and markers
marks={\pgfuseplotmark{*}},% filled circle markers
minx=0,% minimum value on x-axis
maxx=6,% maximum value on x-axis
miny=0,% minimum value on y-axis
maxy=8,% maximum value on y-axis
xticgap=1,% gap between tic marks on x-axis
yticgap=2,% gap between tic marks on y-axis
axes=both% both axes
}
\end{document}
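This produces:
Answer 3
The dangerous way: set \catcode`"=9 (category code 9 makes TeX ignore the character). The safer way: remove the quotes from the file.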
\documentclass{article}
\usepackage{pgfplots}
\usepackage{filecontents}
\begin{filecontents*}{\jobname.csv}
a,b,c,d
"1","4","5","1"
"2","3","1","5"
"3","5","6","1"
4,1,4,9
5,3,4,7
\end{filecontents*}
\begin{filecontents*}{\jobname-noquote.csv}
a,b,c,d
1,4,5,1
2,3,1,5
3,5,6,1
4,1,4,9
5,3,4,7
\end{filecontents*}
\begin{document}
\begin{tikzpicture}
\begin{axis}\catcode`"=9
\addplot table [x=a, y=c, col sep=comma] {\jobname.csv};
\end{axis}
\end{tikzpicture}
\begin{tikzpicture}
\begin{axis}
\addplot table [x=a, y=c, col sep=comma] {\jobname-noquote.csv};
\end{axis}
\end{tikzpicture}
\end{document}
Thanks to Christian Feuersänger! We can use the ignore chars key from Section 2.1 of the pgfplotstable manual.
\documentclass{article}
\usepackage{pgfplots}
\usepackage{filecontents}
\begin{filecontents*}{\jobname.csv}
a,b,c,d
"1","4","5","1"
"2","3","1","5"
"3","5","6","1"
4,1,4,9
5,3,4,7
\end{filecontents*}
\begin{filecontents*}{\jobname-noquote.csv}
a,b,c,d
1,4,5,1
2,3,1,5
3,5,6,1
4,1,4,9
5,3,4,7
\end{filecontents*}
\begin{document}
\begin{tikzpicture}
\begin{axis}
\addplot table [x=a, y=c, col sep=comma,ignore chars={"}] {\jobname.csv};
\end{axis}
\end{tikzpicture}
\begin{tikzpicture}
\begin{axis}
\addplot table [x=a, y=c, col sep=comma] {\jobname-noquote.csv};
\end{axis}
\end{tikzpicture}
\end{document}
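Since the question mentions many .csv files, it may be convenient to set the keys once instead of repeating them on every \addplot table. The following is a sketch under the assumption that col sep and ignore chars, set globally with \pgfplotstableset, are picked up by \addplot table in the same way as when they are given per plot. It is worth checking against your own files.
\documentclass{article}
\usepackage{pgfplots, pgfplotstable}
\usepackage{filecontents}
\begin{filecontents*}{\jobname.csv}
a,b,c,d
"1","4","5","1"
"2","3","1","5"
"3","5","6","1"
4,1,4,9
5,3,4,7
\end{filecontents*}
% Assumed global defaults for every table read after this point
\pgfplotstableset{col sep=comma, ignore chars={"}}
\begin{document}
\begin{tikzpicture}
\begin{axis}
% No per-plot col sep or ignore chars needed here
\addplot table [x=a, y=c] {\jobname.csv};
\end{axis}
\end{tikzpicture}
\end{document}
Files that use a different separator can still override the default locally, for example with col sep=semicolon in the options of that one \addplot table.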