[R] Simple 95% confidence interval for a median

peter dalgaard pdalgd at gmail.com
Thu May 12 19:51:19 CEST 2011


On May 12, 2011, at 18:33 , Greg Snow wrote:

> Contrary to the commonly held assumption, the Wilcoxon test does not deal with medians in general.
> 
> There are some specific cases/assumptions where the test/interval would apply to the median. If I remember correctly, the assumptions include that the population distribution is symmetric and that the only alternatives considered are shifts of the distribution (both assumptions that go contrary to what I would believe in most situations where I would want to use the Wilcoxon test).

Yes. Notice that the signed-rank Wilcoxon test does in fact assume symmetry under the null hypothesis, which does make sense when looking at differences, but less so away from the null.

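As far as I remember, the (pseudo)median estimate is the value of mu at which the signed-rank test statistic is closest to its null expectation (i.e., it minimizes the absolute value of the centered statistic), but to be sure, read the reference on the help page.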
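
To make the (pseudo)median concrete, here is a minimal sketch on simulated data. It assumes the estimate reported by wilcox.test is the Hodges-Lehmann estimate, i.e. the median of all pairwise averages (x[i] + x[j])/2 with i <= j (the Walsh averages); that should hold when the exact method is used (small n, no ties), but check the help page. The variable names are only for illustration.

```r
# Sketch only: simulated data, illustrative names.
set.seed(1)
x <- rexp(20)                          # small continuous sample, no ties

# Walsh averages: (x[i] + x[j]) / 2 for all pairs with i <= j
s <- outer(x, x, "+") / 2
walsh <- s[upper.tri(s, diag = TRUE)]

pseudo_median <- median(walsh)         # Hodges-Lehmann estimate
wt_estimate <- unname(wilcox.test(x, conf.int = TRUE)$estimate)

# The first two values should agree; the plain sample median generally differs.
c(walsh = pseudo_median, wilcox = wt_estimate, median = median(x))
```

For a skewed distribution such as the exponential, the pseudomedian and the median are different population quantities, which is why the wilcox.test interval need not be an interval for the median.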

> 
> If you want an actual confidence interval on the true median, then you either need to make some assumptions about the distribution that the data comes from, or use a tool like the bootstrap.

You can invert the binomial. Since about 95 percent of the binomial distribution with p=.5, n=86 lies between 35 and 52, you can generate a 95% CI for the median as sort(x)[c(34,53)].

There are a few demons lurking in the details, and it is easy to be off by one, but you get the picture.
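
For general n, the ranks can be picked with qbinom. The following is a rough sketch, not a polished implementation: median_ci() is a hypothetical helper (it is not part of any package), and it assumes a symmetric choice of ranks with each one-sided error kept below alpha/2. For n = 86 it should reproduce the ranks 34 and 53.

```r
# Sketch: distribution-free CI for the median from order statistics.
# median_ci() is a hypothetical helper, not a function from any package.
median_ci <- function(x, conf.level = 0.95) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  alpha <- 1 - conf.level
  # Largest lower rank whose one-sided miss probability,
  # P(B <= lower - 1) with B ~ Binomial(n, 0.5), stays below alpha/2
  lower <- qbinom(alpha / 2, n, 0.5)
  if (lower < 1) stop("sample too small for this confidence level")
  upper <- n - lower + 1               # symmetric upper rank
  list(index = c(lower, upper), interval = x[c(lower, upper)])
}

set.seed(42)
res <- median_ci(rexp(86))
res$index       # ranks of the order statistics used
res$interval    # the interval itself
```

Because the binomial is discrete, the interval is usually somewhat conservative; the actual coverage is whatever the chosen ranks give, not exactly conf.level.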

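Try this:

# 5000 simulated samples of size 86 from Exp(1); each column of ci is one interval
ci <- replicate(5000, {x <- rexp(86); sort(x)[c(34,53)]})
m <- qexp(.5)                     # true median of Exp(1), i.e. log(2)
ci <- ci[, order(colSums(ci))]    # order the intervals for plotting
matplot(t(ci), pch=".")
abline(h=m)
sum(ci[1,] > m)                   # number of intervals lying entirely above the median
sum(ci[2,] < m)                   # number of intervals lying entirely below the median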

(I get about 2% error in either direction, so slightly conservative. Taking c(35,52), I get 3% both ways, so I suppose I got the cutoff right. A bit earlier in the day and I might even be able to prove it...) 
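
The coverage can also be computed exactly instead of simulated. A sketch, assuming a continuous distribution (no ties): let B be the number of observations below the true median, so B is Binomial(n, 0.5). The interval from the l-th to the u-th order statistic contains the median exactly when l <= B <= u - 1. The function name coverage() is just for illustration.

```r
# Exact coverage of [x_(l), x_(u)] for the median of a continuous distribution.
# B = number of observations below the true median ~ Binomial(n, 0.5);
# the interval covers the median exactly when l <= B <= u - 1.
coverage <- function(l, u, n) pbinom(u - 1, n, 0.5) - pbinom(l - 1, n, 0.5)

n <- 86
cov_wide   <- coverage(34, 53, n)      # ranks c(34, 53)
cov_narrow <- coverage(35, 52, n)      # ranks c(35, 52)

round(c(wide = cov_wide, narrow = cov_narrow), 4)

# One-sided miss probabilities (equal on both sides because p = 0.5 is symmetric)
round(c(wide = (1 - cov_wide) / 2, narrow = (1 - cov_narrow) / 2), 4)
```

The one-sided miss probabilities are the quantities that the two sum() lines in the simulation estimate, after dividing the counts by 5000.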

BTW, I'm sure someone has improved on this with some sort of interpolation. 



> 
> -- 
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.snow at imail.org
> 801.408.8111
> 
> 
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Georgina Imberger
>> Sent: Thursday, May 12, 2011 7:36 AM
>> To: r-help at r-project.org
>> Subject: [R] Simple 95% confidence interval for a median
>> 
>> Hi!
>> 
>> I have a data set of 86 values that are non-normally distributed (counts).
>> 
>> The median value is 10. I want to get an estimate of the 95% confidence
>> interval for this median value.
>> 
>> I tried to use a one-sample Wilcoxon test:
>> 
>> wilcox.test(Comps,mu=10,conf.int=TRUE)
>> 
>> and got the following output:
>> 
>> Wilcoxon signed rank test with continuity correction
>> 
>> data:  Comps
>> V = 2111, p-value = 0.05846
>> alternative hypothesis: true location is not equal to 10
>> 95 percent confidence interval:
>> 10.00000 17.49993
>> sample estimates:
>> (pseudo)median
>>      12.50006
>> 
>> I wonder if someone would mind helping me out?
>> 
>> What am I doing wrong?
>> What is the '(pseudo)median'?
>> Can I get R to estimate the confidence around the actual median of 10?
>> 
>> With thanks,
>> Georgie
>> 
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


