Dot-and-whisker plots in PGFPlots


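It is sometimes useful to present regression coefficients and their confidence intervals as a dot-and-whisker plot. The R package `dotwhisker` is designed to produce these plots, and its vignette covers the motivation for the approach along with some examples. Here is one example from that vignette: [image: example dot-and-whisker plot from the R vignette] I would like to reproduce a similar dot-and-whisker plot in PGFPLOTS.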

PGFPLOTS supports box plots natively, and the box plot feature is described in full in the manual for version 1.16.

Here is my best attempt:

\documentclass{article}
\usepackage{pgfplots}
\pgfplotsset{compat=1.16}
\usepgfplotslibrary{statistics}

\begin{document}
\begin{tikzpicture}
\begin{axis}[
    ytick={1,2,3},
    yticklabels={Group A, Group B, Group C},
    ]

\addplot+ [
boxplot prepared={
lower whisker=0.35, 
% lower quartile=,
median=0.6,
% upper quartile=, 
upper whisker=0.85,
},
]
coordinates {};

\addplot+ [
boxplot prepared={
lower whisker=0.55, 
% lower quartile=,
median=0.70,
% upper quartile=, 
upper whisker=0.85,
},
] 
coordinates {};

\addplot+ [
boxplot prepared={
lower whisker=0.85, 
% lower quartile=,
median=0.9,
% upper quartile=, 
upper whisker=0.95,
},
] 
coordinates {};
\end{axis}
\end{tikzpicture}

\end{document}

[image: boxwhisker.PNG]

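My attempt has several problems:

  1. I am bending the box-and-whisker feature to represent something that is not a box plot. That may be the wrong way to go about it. My kludge treats the median as the coefficient estimate and the upper and lower whiskers as the upper and lower bounds of the confidence interval. I commented out the quartiles because I have no equivalent for them and I do not need the box.
  2. Unlike the output of the R package `dotwhisker`, the graph is not clean: the vertical lines overemphasize the confidence bounds relative to the coefficient point estimates.
  3. It does not scale. I want to analyze 50 groups, and I do not want to update the box plot entries in PGFPLOTS by hand every time the data changes. I would like to read from a .csv file with 50 rows (one per group) and four columns: group name, coefficient estimate, upper confidence bound, lower confidence bound.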

Here is some randomly generated data (`upper` and `lower` are symmetric around `point_est`):

\begin{filecontents*}{dotwhisker.csv}
    group,point_est,upper,lower
    SCBQD6600C,0.318940138,0.782642805,-0.144762529
    GHECK1046A,0.541614386,1.425115639,-0.341886867
    ICOOO3242S,0.662666177,1.143455809,0.181876544
    PHOVQ7028A,0.148145345,0.239989182,0.056301508
    HSJEK0588Y,0.564368703,0.997673282,0.131064125
    CYVFG8255L,0.575908384,1.288811424,-0.136994656
    ZDYRJ3242S,0.413789006,0.639376662,0.18820135
    PXQSX1684J,0.418005222,0.974470232,-0.138459788
    VTCRK0417U,0.4153322,1.020437688,-0.189773288
    WSYWC4669M,0.366494326,0.756315385,-0.023326734
    BZPKZ2934L,0.428421095,0.792023892,0.064818298
    EGIPR1094A,0.350242033,0.598746704,0.101737362
    PQTFK6203U,0.383561916,0.660697282,0.10642655
    UYLGX7811M,0.509668823,1.037205877,-0.017868231
    ICEHA2251J,0.643924109,1.452395674,-0.164547457
\end{filecontents*}

Answer 1

Here is code that loops over a csv file and adds these "whiskers". The size of the vertical lines is set by /pgfplots/boxplot/box extend=0.1 (and /pgfplots/boxplot/whisker extend).

\documentclass{article}
\usepackage{pgfplots}
\usepackage{tikzlings}%<- for the second part of the answer ;-)
\usetikzlibrary{shapes.callouts}%<- for the second part of the answer
\usepackage{pgfplotstable}
\usepackage{filecontents}
\begin{filecontents*}{whiskers.dat}
0.6 0.35 0.85
0.7 0.55 0.85
0.9 0.85 0.95
\end{filecontents*}

\pgfplotsset{compat=1.16}
\usepgfplotslibrary{statistics}
\newcounter{iloop}
\newcommand*{\ReadOutElement}[4]{%
    \pgfplotstablegetelem{#2}{[index]#3}\of{#1}%
    \let#4\pgfplotsretval
}

\begin{document}
\begin{tikzpicture}
\pgfplotstableread[header=false]{whiskers.dat}{\datatable}   
\pgfplotstablegetrowsof{\datatable}
\pgfmathtruncatemacro{\numrows}{\pgfplotsretval}
\setcounter{iloop}{1}
\edef\MyLabels{Group A}
\loop\stepcounter{iloop}\edef\MyLabels{\MyLabels,Group \Alph{iloop}}%
\ifnum\value{iloop}<\numrows\repeat
\begin{axis}[
    ytick={1,...,\numrows},
    yticklabels/.expanded=\MyLabels,
    /pgfplots/boxplot/box extend=0.1,
    /pgfplots/boxplot/whisker extend=%
    \pgfkeysvalueof{/pgfplots/boxplot/box extend}*4,
    ]
\pgfplotsinvokeforeach{0,...,\the\numexpr\numrows-1}
{
\ReadOutElement{\datatable}{#1}{0}{\Median}
% whiskers.dat columns are: median, lower bound, upper bound
\ReadOutElement{\datatable}{#1}{1}{\Lower}
\ReadOutElement{\datatable}{#1}{2}{\Upper}
\edef\temp{\noexpand\addplot+ [mark=*,
boxplot prepared={
lower whisker=\Lower, 
% lower quartile=,
median=\Median,
% upper quartile=, 
upper whisker=\Upper,
},
] coordinates {(#1+1,\Median)};
}
\temp
}

\end{axis}
\end{tikzpicture}
\bigskip\bigskip

\begin{tikzpicture}
\marmot[whiskers,teeth]
\node[ellipse callout, fill=white,
    draw,
 font=\sffamily,align=center,%inner sep=-2pt,
 callout relative pointer={(-150:0.9)}] at (2,2.4) {How about\\ my whiskers?};
\end{tikzpicture}
\end{document}
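
The `\ReadOutElement` helper above is a thin wrapper around `\pgfplotstablegetelem`, which stores the requested cell in `\pgfplotsretval`. As a minimal sketch of that lookup on its own, the following reads two cells from a small table. The file name `getelem-demo.csv` and the macros `\demotable`, `\GroupName` and `\Estimate` are placeholders chosen for this illustration; the two data rows are copied from the question.

```latex
\begin{filecontents*}{getelem-demo.csv}
group,point_est,upper,lower
SCBQD6600C,0.318940138,0.782642805,-0.144762529
GHECK1046A,0.541614386,1.425115639,-0.341886867
\end{filecontents*}
\documentclass{article}
\usepackage{pgfplotstable}
\pgfplotsset{compat=1.16}

\begin{document}
% Load the comma-separated file into a table macro (hypothetical name).
\pgfplotstableread[col sep=comma]{getelem-demo.csv}{\demotable}

% Row and column indices are zero-based; [index]0 is the first column.
\pgfplotstablegetelem{1}{[index]0}\of{\demotable}
\let\GroupName\pgfplotsretval % copy now, the next lookup overwrites \pgfplotsretval

\pgfplotstablegetelem{1}{[index]1}\of{\demotable}
\let\Estimate\pgfplotsretval

Group \GroupName{} has point estimate \Estimate.
\end{document}
```

Copying `\pgfplotsretval` with `\let` right after each lookup matters because every call to `\pgfplotstablegetelem` overwrites it.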


The same loop also works with your larger data file:

\documentclass{article}
\usepackage{pgfplots}
\usepackage{pgfplotstable}
\usepackage{filecontents}
\begin{filecontents*}{dotwhisker.csv}
    group,point_est,upper,lower
    RAMOF4414X,0.25858685,0.878367526,-0.161193826
    DOYDL6809E,0.555485848,1.400010767,-0.089039071
    YHYYL9849M,0.716888235,1.368127463,0.265649007
    JMQIG4509E,0.030860459,0.453325401,-0.191604482
    SQYRJ4202T,0.585824938,1.206768474,0.164881402
    OYSCJ0457U,0.601211178,1.362723257,0.0396991
    ZQAXZ5423I,0.385052008,0.892098437,0.078005579
    MDWEF7504F,0.390673629,1.063439815,-0.082092558
    IJUSO8696V,0.3871096,1.084017824,-0.109798624
    BQSBP8990Q,0.321992434,0.908002585,-0.064017717
    OCDWZ7363Q,0.40456146,0.981590932,0.027531988
    OHEEJ6147W,0.300322711,0.814591182,-0.01394576
    KCRLD3716J,0.344749221,0.875554365,0.013944077
    MFPJM1799W,0.512891763,1.177304879,0.048478648
    DIPEZ2121T,0.691898812,1.505729535,0.078068088
    LYOUI9349J,0.477942901,1.060060156,0.095825646
    VZFYT9091T,0.397798099,0.863976498,0.1316197
    OEQZM1870D,0.317760917,0.817961252,0.017560582
    VTGHR8934O,0.332565664,0.913827561,-0.048696233
    KMCVQ3734I,0.983167036,1.675403499,0.490930572
    ZALVK0344T,0.526817644,1.122164944,0.131470344
    PNXOL7981C,0.401862821,0.962308107,0.041417534
    PYIND4089E,0.411890426,0.88306747,0.140713381
    PNZTF5825U,0.591917888,1.201046391,0.182789386
    LPPRP3997H,0.361255951,1.008833148,-0.086321247
    VBUVF6797C,0.210359018,0.815565496,-0.194847461
    INHXV3841Z,0.614951144,1.31681457,0.113087718
    RPWRJ7560V,0.295268855,0.810041955,-0.019504244
    SCTCT3620G,0.49697234,1.070902402,0.123042278
    MAYYI9754J,0.31281028,0.774220754,0.051399806
\end{filecontents*}

\pgfplotsset{compat=1.16}
\usepgfplotslibrary{statistics}
\newcounter{iloop}
\newcommand*{\ReadOutElement}[4]{%
    \pgfplotstablegetelem{#2}{[index]#3}\of{#1}%
    \let#4\pgfplotsretval
}

\begin{document}
\begin{tikzpicture}
\pgfplotstableread[header=true,col sep=comma]{dotwhisker.csv}{\datatable}   
\pgfplotstablegetrowsof{\datatable}
\pgfmathtruncatemacro{\numrows}{\pgfplotsretval}
\ReadOutElement{\datatable}{0}{0}{\Label}
\edef\MyLabels{\Label}
\setcounter{iloop}{1}
\loop
\ReadOutElement{\datatable}{\number\value{iloop}}{0}{\Label}%
\edef\MyLabels{\MyLabels,\Label}%
\stepcounter{iloop}%
\ifnum\value{iloop}<\numrows\repeat
\begin{axis}[width=12cm,height=14cm,
    ytick={1,...,\numrows},
    yticklabels/.expanded=\MyLabels,
    /pgfplots/boxplot/box extend=0.1,
    /pgfplots/boxplot/whisker extend=%
    \pgfkeysvalueof{/pgfplots/boxplot/box extend}*4,
    ]
\pgfplotsinvokeforeach{0,...,\the\numexpr\numrows-1}
{
\ReadOutElement{\datatable}{#1}{1}{\Median}
\ReadOutElement{\datatable}{#1}{2}{\Upper}
\ReadOutElement{\datatable}{#1}{3}{\Lower}
\edef\temp{\noexpand\addplot+ [mark=*,
boxplot prepared={
lower whisker=\Lower, 
% lower quartile=,
median=\Median,
% upper quartile=, 
upper whisker=\Upper,
},
] coordinates {(#1+1,\Median)};
}
\temp
}

\end{axis}
\end{tikzpicture}
\end{document}
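
For comparison, here is a sketch of a different technique that avoids the box plot machinery altogether: a single `\addplot table` with horizontal error bars. This is not the approach used in the code above, only an alternative under stated assumptions. The file name `dotwhisker-demo.csv` and the macro `\demotable` are placeholders, the three data rows are copied from the question, and the column order is assumed to be group, point_est, upper, lower as in the question's file.

```latex
\begin{filecontents*}{dotwhisker-demo.csv}
group,point_est,upper,lower
SCBQD6600C,0.318940138,0.782642805,-0.144762529
GHECK1046A,0.541614386,1.425115639,-0.341886867
ICOOO3242S,0.662666177,1.143455809,0.181876544
\end{filecontents*}
\documentclass{article}
\usepackage{pgfplots}
\usepackage{pgfplotstable}
\pgfplotsset{compat=1.16}

\begin{document}
% Hypothetical table macro; columns by index: 0=group, 1=point_est, 2=upper, 3=lower
\pgfplotstableread[col sep=comma]{dotwhisker-demo.csv}{\demotable}
\begin{tikzpicture}
\begin{axis}[
    ytick=data,                                 % one tick per table row
    yticklabels from table={\demotable}{group}, % label ticks with the group names
    xlabel={Coefficient estimate},
    ]
\addplot+ [
    only marks, mark=*,
    error bars/.cd,
        x dir=both, x explicit,
        error mark=none,                        % plain whiskers without end caps
    ]
    table [
        x index=1,                              % point estimate
        y expr=\coordindex,                     % row number 0,1,2,... as y position
        x error plus expr=\thisrowno{2}-\thisrowno{1},  % upper - point_est
        x error minus expr=\thisrowno{1}-\thisrowno{3}, % point_est - lower
    ] {\demotable};
\end{axis}
\end{tikzpicture}
\end{document}
```

Because the error lengths are computed from the bounds with `x error plus expr` and `x error minus expr`, asymmetric intervals should work too, and adding rows to the csv file requires no change to the plotting code.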

