[R] contour for plotting confidence interval on scatter plot of bivariate normal distribution

David Winsemius dwinsemius at comcast.net
Sat Mar 3 19:56:35 CET 2012


On Mar 3, 2012, at 12:34 PM, drflxms wrote:

> Thanx a lot Greg for the hint and letting me not alone with this!
>
> Tried ellipse and it works well. But I am looking for something more
> precise. The ellipse fits the real border to the nearest possible
> ellipse. I want the "real" contour, if possible.
>
> Meanwhile I found an interesting function named draw.contour
> http://quantitative-ecology.blogspot.com/2008/03/plotting- 
> contours.html
> provided by "Forester", an "Assistant Professor in the Department of
> Fisheries". - I love it how statistics brings people from completely
> different specialities together. - I am a neurologist.
>
> This function results not exactly in the same curve, I got with the  
> code
> I posted last, but is in deed very close by (parallel). So I think  
> both
> solutions are probably true and differ only because of rounding and
> method (..!?).
>
> Still I am not completely happy with
>
> 1. the numeric solution of setting to 68% to get 1SD. I'd prefer a
> symbolic, formula driven solution instead.
> 2. me not comprehending the code completely.
>
> While 2. will be hopefully solved soon by me delving into the code, 1
> remains...
>

You could take a look at answers to similar questions on StackOverflow:

http://stackoverflow.com/questions/6655268/ellipse-containing-percentage-of-given-points-in-r

http://stackoverflow.com/questions/7055848/plot-ellipse-bounding-a-percentage-of-points

http://stackoverflow.com/questions/7718569/how-to-plot-90-confidence-bands-with-locfit-in-r

http://stackoverflow.com/questions/7511889/ellipse-representing-horizontal-and-vertical-error-bars-with-r

Searching on "plot confidence ellipse" in the rhelp archives should be  
similarly productive.

There's also a non-parametric 2d boundary plotting method called a  
bagplot and I think it's implemented in a package by the same name.

-- 
David.

> Best, Felix
>
>
> Am 03.03.12 17:28, schrieb Greg Snow:
>> Look at the ellipse package (and the ellipse function in the package)
>> for a simple way of showing a confidence region for bivariate data on
>> a plot (a 68% confidence interval is about 1 SD if you just want to
>> show 1 SD).
>>
>> On Sat, Mar 3, 2012 at 7:54 AM, drflxms <drflxms at googlemail.com>  
>> wrote:
>>> Dear all,
>>>
>>> I created a bivariate normal distribution:
>>>
>>> set.seed(138813)
>>> n<-100
>>> x<-rnorm(n); y<-rnorm(n)
>>>
>>> and plotted a scatterplot of it:
>>>
>>> plot(x,y)
>>>
>>> Now I'd like to add the 2D-standard deviation.
>>>
>>> I found a thread regarding plotting arbitrary confidence  
>>> boundaries from
>>> Pascal Hänggi
>>> http://www.mail-archive.com/r-help@r-project.org/msg24013.html
>>> which cites the even older thread
>>> http://tolstoy.newcastle.edu.au/R/help/03b/5384.html
>>>
>>> As I am unfortunately only a very poor R programmer, the code of  
>>> Pascal
>>> Hänggi is a myth to me and I am not sure whether I was able to  
>>> translate
>>> the recommendation of Brain Ripley in the later thread (which  
>>> provides
>>> no code) into the the correct R code. Brain wrote:
>>>
>>> You need a 2D density estimate (e.g. kde2d in MASS) then compute the
>>> density values at the points and draw the contour of the density  
>>> which
>>> includes 95% of the points (at a level computed from the sorted  
>>> values
>>> via quantile()). [95% confidence interval was desired in thread  
>>> instead
>>> of standard deviation...]
>>>
>>> So I tried this...
>>>
>>> den<-kde2d(x, y, n=n) #as I chose n to be the same as during  
>>> creating
>>> the distributions x and y (see above), a z-value is assigned to  
>>> every
>>> combination of x and y.
>>>
>>> # create a sorted vector of z-values (instead of the matrix stored
>>> inside the den object
>>> den.z <-sort(den$z)
>>>
>>> # set desired confidence border to draw and store it in variable
>>> confidence.border <- quantile(den.z, probs=0.6827, na.rm = TRUE)
>>>
>>> # draw a line representing confidence.border on the existing  
>>> scatterplot
>>> par(new=TRUE)
>>> contour(den, levels=confidence.border, col = "red", add = TRUE)
>>>
>>> Unfortunately I doubt very much this is correct :( In fact I am sure
>>> this is wrong, because the border for probs=0.05 is drawn outside  
>>> the
>>> values.... So please help and check.
>>> Pascal Hänggis code seems to work, but I don't understand the  
>>> magic he
>>> does with
>>>
>>> pp <- array()
>>> for (i in 1:1000){
>>>       z.x <- max(which(den$x < x[i]))
>>>       z.y <- max(which(den$y < y[i]))
>>>       pp[i] <- den$z[z.x, z.y]
>>> }
>>>
>>> before doing the very same as I did above:
>>>
>>> confidencebound <- quantile(pp, 0.05, na.rm = TRUE)
>>>
>>> plot(x, y)
>>> contour(den, levels = confidencebound, col = "red", add = TRUE)
>>>
>>>
>>> My problems:
>>>
>>> 1.) setting probs=0.6827 is somehow a dirty trick which I can only  
>>> use
>>> by simply knowing that this is the percentage of values inside +-1sd
>>> when a distribution is normal. Is there a way doing this with  
>>> "native"
>>> sd function?
>>> sd(den.z) is not correct, as den.z is in contrast to x and y not  
>>> normal
>>> any more. So ecdf(den.z)(sd(den.z)) results in a percentile of  
>>> 0.5644 in
>>> this example instead of the desired 0.6827.
>>>
>>> 2.) I would like to have code that works with any desired  
>>> confidence.
>>> Unfortunately setting probs to the desired confidence would  
>>> probably be
>>> wrong (?!) as it relates to den.z instead of x and y, which are the
>>> underlying distributions I am interested in. To put it short I  
>>> want the
>>> confidence of x/y and not of den.z.
>>>
>>>
>>> I am really completely stuck. Please help me out of this! Felix
>>>
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT



More information about the R-help mailing list