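I have a data set and want to highlight the difference between two groups. To do this, I would like to build a chart similar to the one below.
My question is how to draw the raw data points and the distribution curve next to the first boxplot, as in the figure above. Here is the code I use to create three boxplots for my data.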
\documentclass{standalone}
\usepackage{pgfplots}
\pgfplotsset{compat = 1.8}
\usepgfplotslibrary{statistics}
\begin{document}
\begin{tikzpicture}
\begin{axis}
[
boxplot/draw direction = y,
ylabel = {lmean},
xtick = {1, 2, 3},
xticklabels = {AB, A, B},
every axis plot/.append style = {fill, fill opacity = .1},
]
\addplot + [
mark = *,
boxplot,
black
]
table [row sep = \\, y index = 0] {
data \\
23.427 \\
23.44604 \\
24.38042 \\
23.38132 \\
23.42772 \\
23.442 \\
22.95047 \\
22.99269 \\
23.40156 \\
23.56823 \\
22.06118 \\
22.75578 \\
19.4876 \\
25.21417 \\
21.41917 \\
20.25453 \\
21.32184 \\
21.56096 \\
20.91323 \\
19.50758 \\
20.6354 \\
21.69172 \\
20.00605 \\
18.33375 \\
20.35927 \\
18.95515 \\
20.18885 \\
18.02808 \\
21.67389 \\
18.0061 \\
};
\addplot + [
mark = *,
boxplot,
black
]
table [row sep = \\, y index = 0] {
data \\
20.25453 \\
21.32184 \\
21.56096 \\
20.91323 \\
19.50758 \\
20.6354 \\
21.69172 \\
20.00605 \\
18.33375 \\
20.35927 \\
18.95515 \\
20.18885 \\
18.02808 \\
21.67389 \\
18.0061 \\
};
\addplot + [
mark = *,
boxplot,
black
]
table [row sep = \\, y index = 0] {
data \\
23.427 \\
23.44604 \\
24.38042 \\
23.38132 \\
23.42772 \\
23.442 \\
22.95047 \\
22.99269 \\
23.40156 \\
23.56823 \\
22.06118 \\
22.75578 \\
19.4876 \\
25.21417 \\
21.41917 \\
};
\end{axis}
\end{tikzpicture}
\end{document}
With this code I get the following figure:
Can anyone help me? Thanks in advance for your attention!
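Answer 1
This quite possibly reinvents some wheels, but you can compute the mean and variance of the combined data and plot a normal distribution, similar to what is done, e.g., here. By the way, it may be easier to let pgfplots merge the data.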
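For reference, the quantities involved can be written as follows. This is only the textbook form of the sample mean, the sample variance, and the normal density; the symbols $N$, $x_i$, $\mu$, $s^2$ and $f$ are my own notation and do not appear in the code. Only the shape of the curve matters for the plot, so the normalizing prefactor can be replaced by any convenient height.

```latex
\[
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
s^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i-\mu)^2, \qquad
f(x) = \frac{1}{\sqrt{2\pi s^2}}\,
       \exp\!\left(-\frac{(x-\mu)^2}{2 s^2}\right)
\]
```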
\documentclass[tikz,border=3mm]{standalone}
\usepackage{pgfplots}
\usepackage{pgfplotstable}
\pgfplotsset{compat = 1.17}
\usepgfplotslibrary{statistics}
\begin{document}
\begin{tikzpicture}
\pgfplotstableread[row sep = \\]{%
data \\
20.25453 \\
21.32184 \\
21.56096 \\
20.91323 \\
19.50758 \\
20.6354 \\
21.69172 \\
20.00605 \\
18.33375 \\
20.35927 \\
18.95515 \\
20.18885 \\
18.02808 \\
21.67389 \\
18.0061 \\
}\dataA
\pgfplotstableread[row sep = \\]{%
data \\
23.427 \\
23.44604 \\
24.38042 \\
23.38132 \\
23.42772 \\
23.442 \\
22.95047 \\
22.99269 \\
23.40156 \\
23.56823 \\
22.06118 \\
22.75578 \\
19.4876 \\
25.21417 \\
21.41917 \\
}\dataB
% combine tables A and B to AB (see https://tex.stackexchange.com/a/188492/194703)
\pgfplotstablevertcat{\dataAB}{\dataA} %
\pgfplotstablevertcat{\dataAB}{\dataB} %
\pgfplotstablegetrowsof{\dataAB}%
\pgfmathtruncatemacro{\numrows}{\pgfplotsretval-1}%
% compute sum, minimum, maximum and average
\edef\sumAB{0}%
\pgfplotsforeachungrouped\X in{0,...,\numrows}
{\pgfplotstablegetelem{\X}{[index]0}\of\dataAB
\pgfmathsetmacro{\sumAB}{\sumAB+\pgfplotsretval}%
\ifnum\X=0
\pgfmathsetmacro{\minAB}{\pgfplotsretval}%
\pgfmathsetmacro{\maxAB}{\pgfplotsretval}%
\else
\pgfmathsetmacro{\minAB}{min(\minAB,\pgfplotsretval)}%
\pgfmathsetmacro{\maxAB}{max(\maxAB,\pgfplotsretval)}%
\fi
}
\pgfmathsetmacro{\averageAB}{\sumAB/(\numrows+1)}
% accumulate the sum of squared deviations from the mean
% (dividing it by N-1 = \numrows gives the sample variance)
\edef\varsqAB{0}%
\pgfplotsforeachungrouped\X in{0,...,\numrows}
{\pgfplotstablegetelem{\X}{[index]0}\of\dataAB
\pgfmathsetmacro{\varsqAB}{\varsqAB+pow(\pgfplotsretval-\averageAB,2)}%
}
\begin{axis}
[xmin=-1,width=12cm,height=8cm,
ylabel = {lmean},
xtick = {1, 2, 3},
xticklabels = {AB, A, B},
every axis plot/.append style = {fill, fill opacity = .1},
]
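% \varsqAB is a sum of squares, so \varsqAB/\numrows is the sample variance (\numrows = N-1);
% the peak is scaled to height 1, so the curve runs from x = -1 (the baseline, at xmin) to x = 0
\addplot[domain=\minAB:\maxAB,fill=none] ({-1+exp(-pow(x-\averageAB,2)/(2*\varsqAB/\numrows))},x);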
\addplot + [boxplot/draw direction = y,
mark = *,
boxplot,
black
]
table [y index = 0] {\dataAB};
\addplot + [boxplot/draw direction = y,
mark = *,
boxplot,
black
]
table [y index = 0] {\dataA};
\typeout{\pgfplotsboxplotvalue{average}}
\addplot + [boxplot/draw direction = y,
mark = *,
boxplot,
black
]
table [y index = 0] {\dataB};
\end{axis}
\end{tikzpicture}
\end{document}
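The question also asks for the raw data next to the box. One possible way, given here only as a minimal sketch, is to plot the same table a second time with `only marks` and a constant `x expr`, so that all observations end up in a column beside the box. The table name `\dataDemo`, the position `0.4`, and the shortened data (the first five values of group A) are illustrative assumptions, not part of the answer above.

```latex
\documentclass[tikz,border=3mm]{standalone}
\usepackage{pgfplots}
\usepackage{pgfplotstable}
\pgfplotsset{compat = 1.17}
\usepgfplotslibrary{statistics}
\begin{document}
% shortened illustrative table (hypothetical name \dataDemo)
\pgfplotstableread[row sep = \\]{%
data \\
20.25453 \\
21.32184 \\
21.56096 \\
20.91323 \\
19.50758 \\
}\dataDemo
\begin{tikzpicture}
\begin{axis}[
  xmin = 0, xmax = 2,
  xtick = {1},
  xticklabels = {A},
  ylabel = {lmean},
]
% raw observations: every row is drawn at the fixed position x = 0.4
\addplot [only marks, mark = *, mark size = 1.5pt, black]
  table [x expr = 0.4, y index = 0] {\dataDemo};
% box plot of the same table, placed explicitly at x = 1
\addplot [boxplot, boxplot/draw direction = y,
          boxplot/draw position = 1, black]
  table [y index = 0] {\dataDemo};
\end{axis}
\end{tikzpicture}
\end{document}
```

Setting `boxplot/draw position` explicitly keeps the box at a known place regardless of how many other plots precede it; the marks can then be moved by changing the constant in `x expr`.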