# [R] confidence / prediction ellipse

Michael Friendly friendly at yorku.ca
Sun Feb 10 01:29:20 CET 2013

```Hi Giuseppe
You've posted a series of questions on this topic, never with any code
you've tried or data, and usually with some undefined references to
'a' and 'b'.  In spite of this, a variety of people have tried to
give you helpful replies, intuiting what it is you might have meant.

This is wasteful of everyone's time. If you want help from R-help,
code and data. Otherwise, you might post to
R-mindreaders if it ever gets established.

On 2/7/2013 11:20 AM, Giuseppe Amatulli wrote:
> Hi Rolf,
> sorry for this late answer and thanks for your kind explanation and
> relative R code. I really appreciate.
> In reality the concept that I'm trying to address is a bit more complex.
> I'm fitting a model y vs 6 predictors with MARS / RandomForest /
> Multiple Linear Regression Models having 140 observations.
> I have the prediction of each model and would like to delineate the
> prediction ellipses for 3 models, for the 95% probability, and
> plotting them together with the observation vs prediction.
> I think that the prediction-ellipses code that you provide to me is
> valid also for predictions derived by not-linear model (such as MARS
> and RF).
> Is it correct? or should i use an alternative solution ?
>
> Moreover, I was expecting that the  abline (lm(b,a)) would be
> correspond to the main axis of the prediction ellipse, but is not this
> the case.
> why?
>
> Giuseppe
>
> On 28 January 2013 19:04, Rolf Turner <rolf.turner at xtra.co.nz> wrote:
>>
>> I believe that the value of "radius" that you are using is incorrect. If you
>> have a data
>> matrix X whose columns  are jointly distributed N(mu,Sigma) then a
>> confidence
>> ellipse for mu is determined by
>>
>>      n * (x - Xbar)' S^{-1}(x - Xbar) ~ T^2
>>
>> where Xbar is the mean vector for X and S is the sample covariance matrix,
>> and where "T^2" means Hotelling's T-squared distribution, which is equal to
>>
>>      (n-1)*2/(n-2) * F_{2,n-2}
>>
>> the latter representing the F distribution on 2 and n-2 degrees of freedom.
>>
>>
>>      radius <- sqrt(2 * (npts-1) * qf(0.95, 2, npts-2)/(npts*(npts-2)))
>>
>> where npts <- length(a).  Note that it is qf(0.95,2,npts-2) and *NOT*
>> qf(0.95,2,npts-1).
>>
>> To get the corresponding *prediction* ellipse simply multiply the foregoing
>> radius by sqrt(npts+1).  By "prediction ellipse" I mean an ellipse such that
>> the probability that a new independent observation from the same population
>> will fall in that ellipse is the given probability (e.g. 0.95). Note that
>> this does
>> not mean that 95% of the data will fall in the calculated ellipse (basically
>> because
>> of the *dependence* between S and the individual observations).
>>
>> These confidence and prediction ellipses are (I'm pretty sure) valid under
>> the assumption that the data are (two dimensional, independent) Gaussian,
>> and that you use the sample covariance and sample mean as "shape" and
>> "centre".  I don't know what impact your robustification procedure of using
>> cov.trob() will/would have on the properties of these ellipses.
>>
>> A script which does the ellipses for your toy data, using the sample
>> covariance
>> and sample mean (rather than output from cov.trob()) is as follows:
>>
>> #
>> # Script scr.amatulli
>> #
>>
>> require(car)
>> a <- c(12,12,4,5,63,63,23)
>> b <- c(13,15,7,10,73,83,43)
>> npts   <- length(a)
>> shape  <- var(cbind(a, b))
>> center <- c(mean(a),mean(b))
>> rconf  <- sqrt(2 * (npts-1) * qf(0.95, 2, npts-2)/(npts*(npts-2)))
>> rpred  <- sqrt(npts+1)*rconf
>>
>> conf.elip <- ellipse(center, shape, rconf,draw = FALSE)
>> pred.elip <- ellipse(center, shape, rpred,draw = FALSE)
>> plot(pred.elip, type='l')
>> points(a,b)
>> lines(conf.elip,col="red")
>>
>>      cheers,
>>
>>          Rolf Turner
>>
>>
>> On 01/27/2013 10:12 AM, Giuseppe Amatulli wrote:
>>>
>>> Hi,
>>> I'm using the R library(car) to draw confidence/prediction ellipses in a
>>> scatterplot.
>>> >From what i understood  the ellipse() function return an ellipse based
>>> parameters:  shape, center,  radius .
>>> If i read  dataEllipse() function i can see how these parameters are
>>> calculated for a confidence ellipse.
>>>
>>> ibrary(car)
>>>
>>> a=c(12,12,4,5,63,63,23)
>>> b=c(13,15,7,10,73,83,43)
>>>
>>> v <- cov.trob(cbind(a, b))
>>> shape <- v\$cov
>>> center <- v\$center
>>>
>>> radius <- sqrt(2 * qf(0.95, 2, length(a) - 1))   # radius <- sqrt(dfn *
>>> qf(level, dfn, dfd))
>>>
>>> conf.elip = ellipse(center, shape, radius,draw = F)
>>> plot(conf.elip, type='l')
>>> points(a,b)
>>>
>>> My question is how I can calculate shape, center and radius  to obtain a
>>> prediction ellipses rather than a confidence ellipse?
>>> Giuseppe
>>>
>>
>
>
>
> --
> Giuseppe Amatulli
> Web: www.spatial-ecology.net
>

--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca