Problem with whiskers and outliers in a box plot

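I am trying to reproduce a box plot from MATLAB in LaTeX. Unfortunately, the box plot does not seem to draw the outliers, and there are also problems with the whiskers; see the figure.

[Figure: box plot produced by LaTeX]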

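The whiskers of the first two boxes seem wrong: they should not be there at all.
In addition, the outliers are missing throughout the plot, whereas the MATLAB version shows them. This is the MATLAB version:

[Figure: box plot produced by MATLAB]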

This is the LaTeX code:

\documentclass[11pt,a4paper,twoside,openright]{report}
\usepackage{tikz}
\usepackage{pgfplots}
\usetikzlibrary{pgfplots.statistics}
\usepackage{algorithm}
\usepackage{algorithmic}
\usepackage[T1]{fontenc} 
\usepackage[utf8]{inputenc}
\usepackage{color}
\usepackage{graphicx}
\usepackage{amsmath}
\usepackage{amsfonts}
\usepackage{array} 
\usepackage{verbatim}
\usepackage{caption}
\usepackage{epstopdf}

\usepackage{subcaption}
\usepackage{tcolorbox}


\begin{document}
\begin{tikzpicture}
\begin{semilogyaxis}[
boxplot/draw direction=y,
xtick={1,2,3,4,5,6},
xticklabels={3, 4, 5, 6, 7, 8},
xlabel=number,
ylabel={time[s]},
boxplot/variable width,
boxplot/whisker range={1.57},
]
\addplot[boxplot,box extend=2]
table[row sep=\\,y index=0] {
data\\
0.090000 \\ 0.440000 \\ 0.120000 \\ 0.060000 \\ 0.320000 \\ 0.230000 \\ 0.440000 \\ 0.020000 \\ 0.150000 \\ 0.180000 \\ 0.000000 \\ 0.290000 \\ 0.000000 \\ 0.110000 \\ 0.260000 \\ 0.110000 \\ 0.000000 \\ 0.450000 \\ 0.040000 \\ 0.140000 \\ 0.030000 \\ 0.120000 \\ 0.140000 \\ 0.310000 \\ 0.060000 \\ 0.060000 \\ 0.110000 \\ 0.120000 \\ 0.120000 \\ 0.120000 \\ 0.130000 \\ 0.010000 \\ 0.400000 \\ 0.010000 \\ 0.030000 \\ 0.170000 \\ 0.000000 \\ 0.100000 \\ 0.150000 \\ 0.160000 \\ 0.060000 \\ 0.100000 \\ 0.010000 \\ 0.600000 \\ 0.260000 \\ 0.110000 \\ 0.150000 \\ 0.220000 \\ 0.140000 \\ 0.010000 \\ 
};
\addplot[boxplot,box extend=2]
table[row sep=\\,y index=0] {
data\\
0.070000 \\ 0.490000 \\ 0.340000 \\ 0.200000 \\ 0.020000 \\ 1.080000 \\ 6.830000 \\ 0.310000 \\ 0.540000 \\ 0.020000 \\ 0.290000 \\ 0.180000 \\ 0.600000 \\ 0.090000 \\ 0.610000 \\ 1.370000 \\ 0.260000 \\ 0.030000 \\ 2.300000 \\ 0.090000 \\ 3.150000 \\ 0.130000 \\ 0.290000 \\ 0.270000 \\ 1.300000 \\ 0.730000 \\ 0.630000 \\ 0.240000 \\ 10.030000 \\ 0.000000 \\ 0.260000 \\ 0.180000 \\ 3.290000 \\ 2.430000 \\ 1.940000 \\ 0.220000 \\ 0.230000 \\ 0.600000 \\ 1.690000 \\ 0.350000 \\ 3.960000 \\ 0.560000 \\ 9.900000 \\ 0.100000 \\ 0.430000 \\ 0.220000 \\ 0.260000 \\ 0.310000 \\ 0.290000 \\ 0.790000 \\ 
};
\addplot[boxplot,box extend=2]
table[row sep=\\,y index=0] {
data\\
12.700000 \\ 1.340000 \\ 0.680000 \\ 0.510000 \\ 1.770000 \\ 0.040000 \\ 3.790000 \\ 287.050000 \\ 1.350000 \\ 5.410000 \\ 15.560000 \\ 3.130000 \\ 0.910000 \\ 7.480000 \\ 2.400000 \\ 1.040000 \\ 3.530000 \\ 0.580000 \\ 31.710000 \\ 7.890000 \\ 4.900000 \\ 2.610000 \\ 0.890000 \\ 0.030000 \\ 3.780000 \\ 8.110000 \\ 4.820000 \\ 1.020000 \\ 5.570000 \\ 8.850000 \\ 0.150000 \\ 17.590000 \\ 0.210000 \\ 8.100000 \\ 2.150000 \\ 3.430000 \\ 6.440000 \\ 1.650000 \\ 6.830000 \\ 23.540000 \\ 0.520000 \\ 1.470000 \\ 0.750000 \\ 3.540000 \\ 3.590000 \\ 5.560000 \\ 0.330000 \\ 8.580000 \\ 1.900000 \\ 0.780000 \\ 
};
\addplot[boxplot,box extend=2]
table[row sep=\\,y index=0] {
data\\
55.720000 \\ 14.910000 \\ 14.950000 \\ 6.010000 \\ 6.530000 \\ 88.300000 \\ 281.500000 \\ 40.150000 \\ 13.410000 \\ 0.910000 \\ 1.650000 \\ 44.320000 \\ 13.410000 \\ 7.330000 \\ 3.510000 \\ 3.440000 \\ 70.400000 \\ 0.750000 \\ 58.200000 \\ 54.880000 \\ 26.450000 \\ 33.760000 \\ 0.700000 \\ 0.050000 \\ 0.290000 \\ 57.120000 \\ 14.300000 \\ 31.110000 \\ 18.560000 \\ 0.480000 \\ 21.330000 \\ 1.150000 \\ 2.220000 \\ 3.880000 \\ 1.780000 \\ 151.250000 \\ 7.770000 \\ 137.920000 \\ 0.500000 \\ 3.010000 \\ 1.990000 \\ 23.180000 \\ 119.590000 \\ 17.500000 \\ 15.870000 \\ 13.630000 \\ 21.850000 \\ 23.530000 \\ 68.720000 \\ 2.900000 \\ 
};
\addplot[boxplot,box extend=2]
table[row sep=\\,y index=0] {
data\\
1.190000 \\ 1.940000 \\ 13.400000 \\ 7.400000 \\ 267.300000 \\ 5.940000 \\ 11.050000 \\ 6.510000 \\ 2.940000 \\ 5.450000 \\ 5.240000 \\ 231.000000 \\ 4.480000 \\ 0.680000 \\ 311.290000 \\ 77.470000 \\ 621.200000 \\ 139.080000 \\ 1933.590000 \\ 2.520000 \\ 100.960000 \\ 11.020000 \\ 153.430000 \\ 26.670000 \\ 83.840000 \\ 4.310000 \\ 106.340000 \\ 15.900000 \\ 1118.590000 \\ 9.490000 \\ 131.480000 \\ 48.920000 \\ 5.850000 \\ 3.740000 \\ 1.050000 \\ 32.030000 \\ 5.690000 \\ 45.100000 \\ 12.430000 \\ 238.560000 \\ 28.750000 \\ 1.010000 \\ 119.290000 \\ 12.090000 \\ 31.180000 \\ 16.600000 \\ 29.670000 \\ 138.550000 \\ 17.420000 \\ 0.830000 \\ 
};
\addplot[boxplot,box extend=2]
table[row sep=\\,y index=0] {
data\\
2077.450000 \\ 762.100000 \\ 469.000000 \\ 143.600000 \\ 685.000000 \\ 3600.000000 \\ 20.200000 \\ 249.600000 \\ 269.000000 \\ 0.300000 \\ 0.200000 \\ 779.400000 \\ 1.800000 \\ 146.800000 \\ 1.300000 \\ 32.500000 \\ 137.000000 \\ 2016.400000 \\ 2.300000 \\ 33.900000 \\ 801.600000 \\ 2.200000 \\ 646.900000 \\ 3600.000000 \\ 1184.000000 \\ 627.000000 \\ 500.500000 \\ 238.300000 \\ 477.400000 \\ 3600.000000 \\ 17.800000 \\ 1726.800000 \\ 2.000000 \\ 316.700000 \\ 174.500000 \\ 2802.700000 \\ 335.300000 \\ 201.200000 \\ 1.100000 \\ 247.100000 \\ 2705.100000 \\ 156.900000 \\ 5.100000 \\ 2342.500000 \\ 3600.000000 \\ 3600.000000 \\ 72.700000 \\ 47.400000 \\ 301.200000 \\ 1.600000 \\ 
};
\end{semilogyaxis}
\end{tikzpicture}
\end{document}

I also tried the code from another answer, but it does not work with my data. I don't understand what I am doing wrong.

\documentclass{article}
\usepackage{pgfplots}
\usepackage{filecontents}

\begin{filecontents}{testdata.dat}
3 0.090000 0.440000 0.120000 0.060000 0.320000 0.230000 0.440000 0.020000 0.150000 0.180000 0.000000 0.290000 0.000000 0.110000 0.260000 0.110000 0.000000 0.450000 0.040000 0.140000 0.030000 0.120000 0.140000 0.310000 0.060000 0.060000 0.110000 0.120000 0.120000 0.120000 0.130000 0.010000 0.400000 0.010000 0.030000 0.170000 0.000000 0.100000 0.150000 0.160000 0.060000 0.100000 0.010000 0.600000 0.260000 0.110000 0.150000 0.220000 0.140000 0.010000 
4 0.070000 0.490000 0.340000 0.200000 0.020000 1.080000 6.830000 0.310000 0.540000 0.020000 0.290000 0.180000 0.600000 0.090000 0.610000 1.370000 0.260000 0.030000 2.300000 0.090000 3.150000 0.130000 0.290000 0.270000 1.300000 0.730000 0.630000 0.240000 10.030000 0.000000 0.260000 0.180000 3.290000 2.430000 1.940000 0.220000 0.230000 0.600000 1.690000 0.350000 3.960000 0.560000 9.900000 0.100000 0.430000 0.220000 0.260000 0.310000 0.290000 0.790000 
5 12.700000 1.340000 0.680000 0.510000 1.770000 0.040000 3.790000 287.050000 1.350000 5.410000 15.560000 3.130000 0.910000 7.480000 2.400000 1.040000 3.530000 0.580000 31.710000 7.890000 4.900000 2.610000 0.890000 0.030000 3.780000 8.110000 4.820000 1.020000 5.570000 8.850000 0.150000 17.590000 0.210000 8.100000 2.150000 3.430000 6.440000 1.650000 6.830000 23.540000 0.520000 1.470000 0.750000 3.540000 3.590000 5.560000 0.330000 8.580000 1.900000 0.780000 
6 55.720000 14.910000 14.950000 6.010000 6.530000 88.300000 281.500000 40.150000 13.410000 0.910000 1.650000 44.320000 13.410000 7.330000 3.510000 3.440000 70.400000 0.750000 58.200000 54.880000 26.450000 33.760000 0.700000 0.050000 0.290000 57.120000 14.300000 31.110000 18.560000 0.480000 21.330000 1.150000 2.220000 3.880000 1.780000 151.250000 7.770000 137.920000 0.500000 3.010000 1.990000 23.180000 119.590000 17.500000 15.870000 13.630000 21.850000 23.530000 68.720000 2.900000 
7 1.190000 1.940000 13.400000 7.400000 267.300000 5.940000 11.050000 6.510000 2.940000 5.450000 5.240000 231.000000 4.480000 0.680000 311.290000 77.470000 621.200000 139.080000 1933.590000 2.520000 100.960000 11.020000 153.430000 26.670000 83.840000 4.310000 106.340000 15.900000 1118.590000 9.490000 131.480000 48.920000 5.850000 3.740000 1.050000 32.030000 5.690000 45.100000 12.430000 238.560000 28.750000 1.010000 119.290000 12.090000 31.180000 16.600000 29.670000 138.550000 17.420000 0.830000 
8 2077.450000 762.100000 469.000000 143.600000 685.000000 3600.000000 20.200000 249.600000 269.000000 0.300000 0.200000 779.400000 1.800000 146.800000 1.300000 32.500000 137.000000 2016.400000 2.300000 33.900000 801.600000 2.200000 646.900000 3600.000000 1184.000000 627.000000 500.500000 238.300000 477.400000 3600.000000 17.800000 1726.800000 2.000000 316.700000 174.500000 2802.700000 335.300000 201.200000 1.100000 247.100000 2705.100000 156.900000 5.100000 2342.500000 3600.000000 3600.000000 72.700000 47.400000 301.200000 1.600000 
\end{filecontents}

\pgfplotsset{
    box plot/.style={
        /pgfplots/.cd,
        black,
        only marks,
        mark=-,
        mark size=1em,
        /pgfplots/error bars/.cd,
        y dir=plus,
        y explicit,
    },
    box plot box/.style={
        /pgfplots/error bars/draw error bar/.code 2 args={%
            \draw  ##1 -- ++(1em,0pt) |- ##2 -- ++(-1em,0pt) |- ##1 -- cycle;
        },
        /pgfplots/table/.cd,
        y index=2,
        y error expr={\thisrowno{3}-\thisrowno{2}},
        /pgfplots/box plot
    },
    box plot top whisker/.style={
        /pgfplots/error bars/draw error bar/.code 2 args={%
            \pgfkeysgetvalue{/pgfplots/error bars/error mark}%
            {\pgfplotserrorbarsmark}%
            \pgfkeysgetvalue{/pgfplots/error bars/error mark options}%
            {\pgfplotserrorbarsmarkopts}%
            \path ##1 -- ##2;
        },
        /pgfplots/table/.cd,
        y index=4,
        y error expr={\thisrowno{2}-\thisrowno{4}},
        /pgfplots/box plot
    },
    box plot bottom whisker/.style={
        /pgfplots/error bars/draw error bar/.code 2 args={%
            \pgfkeysgetvalue{/pgfplots/error bars/error mark}%
            {\pgfplotserrorbarsmark}%
            \pgfkeysgetvalue{/pgfplots/error bars/error mark options}%
            {\pgfplotserrorbarsmarkopts}%
            \path ##1 -- ##2;
        },
        /pgfplots/table/.cd,
        y index=5,
        y error expr={\thisrowno{3}-\thisrowno{5}},
        /pgfplots/box plot
    },
    box plot median/.style={
        /pgfplots/box plot
    }
}

\begin{document}
\begin{tikzpicture}
\begin{axis} [ 
        ymode=log,
        enlarge x limits=0.5,
        xtick=data]
    \addplot [box plot median] table {testdata.dat};
    \addplot [box plot box] table {testdata.dat};
    \addplot [box plot top whisker] table {testdata.dat};
    \addplot [box plot bottom whisker] table {testdata.dat};
\end{axis}
\end{tikzpicture}
\end{document}

Answer 1

A solution with PSTricks:

\documentclass{article}
\usepackage{pst-plot}

\begin{document}
\begin{pspicture}(-1,-2)(9,5)
\psset{fillstyle=solid}
\psaxes[ylogBase=10,Oy=-2,logLines=y,ticksize=0 4pt, subticks=5](1,-2)(9,4)
\rput(3,0){\psBoxplot[fillcolor=red!30,barwidth=0.9cm,postAction=Log]{ 
  0.09 0.44 0.12 0.06 0.32 0.23 0.44 0.02 0.15 0.18 0 0.29 0 0.11 0.26 0.11 0 0.45 0.04 0.14 0.03 0.12 0.14 0.31 0.06 0.06 0.11 0.12 0.12 0.12 0.13 0.01 0.40 0.01 0.03 0.17 0 0.10 0.15 0.16 0.06 0.10 0.01 0.60 0.26 0.11 0.15 0.22 0.14 0.01 }}
 \rput(4,0){\psBoxplot[fillcolor=red!30,barwidth=0.9cm,postAction=Log]{ 
0.07 0.49 0.34 0.20 0.02 1.08 6.83 0.31 0.54 0.02 0.29 0.18 0.60 0.09 0.61 1.37 0.26 0.03 2.30 0.09 3.15 0.13 0.29 0.27 1.30 0.73 0.63 0.24 10.03 0 0.26 0.18 3.29 2.43 1.94 0.22 0.23 0.60 1.69 0.35 3.96 0.56 9.90 0.10 0.43 0.22 0.26 0.31 0.29 0.79 }}
 \rput(5,0){\psBoxplot[fillcolor=red!30,barwidth=0.9cm,postAction=log]{ 
12.70 1.34 0.68 0.51 1.77 0.04 3.79 287.05 1.35 5.41 15.56 3.13 0.91 7.48 2.40 1.04 3.53 0.58 31.71 7.89 4.90 2.61 0.89 0.03 3.78 8.11 4.82 1.02 5.57 8.85 0.15 17.59 0.21 8.10 2.15 3.43 6.44 1.65 6.83 23.54 0.52 1.47 0.75 3.54 3.59 5.56 0.33 8.58 1.90 0.78  }}
 \rput(6,0){\psBoxplot[fillcolor=red!30,barwidth=0.9cm,postAction=log]{ 
55.72 14.91 14.95 6.01 6.53 88.30 281.50 40.15 13.41 0.91 1.65 44.32 13.41 7.33 3.51 3.44 70.40 0.75 58.20 54.88 26.45 33.76 0.70 0.05 0.29 57.12 14.30 31.11 18.56 0.48 21.33 1.15 2.22 3.88 1.78 151.25 7.77 137.92 0.50 3.01 1.99 23.18 119.59 17.50 15.87 13.63 21.85 23.53 68.72 2.90  }}
 \rput(7,0){\psBoxplot[fillcolor=red!30,barwidth=0.9cm,postAction=log]{ 
1.19 1.94 13.40 7.40 267.30 5.94 11.05 6.51 2.94 5.45 5.24 231 4.48 0.68 311.29 77.47 621.20 139.08 1933.59 2.52 100.96 11.02 153.43 26.67 83.84 4.31 106.34 15.90 1118.59 9.49 131.48 48.92 5.85 3.74 1.05 32.03 5.69 45.10 12.43 238.56 28.75 1.01 119.29 12.09 31.18 16.60 29.67 138.55 17.42 0.83  }}
 \rput(8,0){\psBoxplot[fillcolor=red!30,barwidth=0.9cm,postAction=log]{ 
2077.45 762.10 469 143.60 685 3600 20.20 249.60 269 0.30 0.20 779.40 1.80 146.80 1.30 32.50 137 2016.40 2.30 33.90 801.60 2.20 646.90 3600 1184 627 500.50 238.30 477.40 3600 17.80 1726.80 2 316.70 174.50 2802.70 335.30 201.20 1.10 247.10 2705.10 156.90 5.10 2342.50 3600 3600 72.70 47.40 301.20 1.60  }}
\end{pspicture}

\end{document}

This requires pst-plot.tex from http://texnik.dante.de/tex/generic/pst-plot/ and pstricks.pro from http://texnik.dante.de/dvips/pstricks/

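Answer 2

I found the TikZ bug: if you look closely, the wrong whiskers all appear to have exactly the same value. If you compute them by hand, they all come out as 0. What happens when someone tries to plot a zero on a logarithmic scale, where log(0) is undefined, without any check? This bug happens ;-) For example, if you replace the 0.000 values with 0.001, the whiskers are displayed correctly. I don't know where to report this bug. Actually, it seems to me that the bug is on the MATLAB side as well, because a whisker (possibly an endless one) should appear in that plot.

One way to solve this is to remove the 0 values and manage the box widths differently.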
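
The effect described above can be checked with a small test document. The following is a minimal sketch, not the answerer's code. It uses ten values taken from the first data series, with the 0.000 entry already replaced by 0.001, and it assumes pgfplots 1.8 or later with the statistics library. Setting `mark=*` is a further assumption, based on pgfplots drawing outliers with the marker of the current plot, so that any outliers get a visible symbol.

```latex
\documentclass{standalone}
\usepackage{pgfplots}
\usepgfplotslibrary{statistics}
\pgfplotsset{compat=1.8}

\begin{document}
\begin{tikzpicture}
\begin{semilogyaxis}[
  boxplot/draw direction=y,
  ylabel={time [s]},
]
  % Ten values taken from the first series; the 0.000 entry has been
  % replaced by 0.001 because log(0) is undefined on a logarithmic axis.
  % mark=* is an assumption so that outliers, if any, get a visible marker.
  \addplot[boxplot, mark=*]
    table[row sep=\\, y index=0] {
      data\\
      0.001\\ 0.010\\ 0.030\\ 0.060\\ 0.110\\ 0.120\\ 0.150\\ 0.260\\ 0.440\\ 0.600\\
    };
\end{semilogyaxis}
\end{tikzpicture}
\end{document}
```

Changing the first entry back to 0.000 should reproduce the faulty whisker that the answer describes.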

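A second way to solve this is to write a simple program that takes the data and generates a custom box plot. The program computes the quartiles, the median, the whiskers, and the outliers. When it finds a whisker at zero, it replaces it with a whisker close to zero; in this case I used 0.0001.
This way I was able to generate this plot:

[Figure: resulting box plot]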
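
The second approach can also be sketched directly in pgfplots with `boxplot prepared`, which takes precomputed statistics instead of raw data. This is an illustration, not the program the answer refers to. The numbers are for the first series only and were computed by hand, assuming quartiles at the sorted positions 0.25n + 0.5 and 0.75n + 0.5 (n = 50) and whiskers at the most extreme data points within 1.5 interquartile ranges of the box; another quartile convention may give slightly different values. The lower whisker comes out as 0 and is replaced by 0.0001, as in the answer. `mark=*` is again an assumption to make the outliers visible, and whether prepared values are handled correctly on a logarithmic axis may depend on the pgfplots version.

```latex
\documentclass{standalone}
\usepackage{pgfplots}
\usepgfplotslibrary{statistics}
\pgfplotsset{compat=1.8}

\begin{document}
\begin{tikzpicture}
\begin{semilogyaxis}[
  boxplot/draw direction=y,
  ylabel={time [s]},
]
  \addplot[mark=*, boxplot prepared={
      lower whisker=0.0001, % computed value is 0; clamped to stay on the log axis
      lower quartile=0.06,
      median=0.12,
      upper quartile=0.18,
      upper whisker=0.32,   % largest value below 0.18 + 1.5*(0.18 - 0.06) = 0.36
    }]
    % The table rows are the outliers of the first series (values above 0.36).
    table[row sep=\\, y index=0] {
      data\\
      0.40\\ 0.44\\ 0.44\\ 0.45\\ 0.60\\
    };
\end{semilogyaxis}
\end{tikzpicture}
\end{document}
```

The same pattern would be repeated with one `\addplot` per series, each with its own precomputed statistics and outlier table.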
