[R] contour for plotting confidence interval on scatter plot of bivariate normal distribution

Sat Mar 3 20:25:06 CET 2012

Thank you very much for your thoughts!

Exactly what you mention is, what I am thinking about during the last
hours: What is the relation between the den$z distribution and the z
distribution.
That's why I asked for ecdf(distribution)(value)->percentile earlier
this day (thank you again for your quick and insightful answer on
that!). I used it to compare certain values in both distributions by
their percentile.

I really think you are completely right: I urgently need some lessons in
bivariate/multivariate normal distributions. (I am a neurologist and
unfortunately did not learn too much about statistics in university :-()
I'll take your statement as a starter:

"Once you go into two dimensions, SD loses all meaning, and adding
nonparametric density estimation into the mix doesn't help, so just stop
thinking in those terms!"

This makes me really think a lot! Is plotting the 0,68 confidence
interval in 2D as equivalent to +-1 SD really nonsense!?

By the way: all started very harmless. I was asked to draw an example of
the well known target analogy for accuracy and precision based on "real"
(=simulated) data. (see i.e.
http://en.wikipedia.org/wiki/Accuracy_and_precision for a simple hand
made 2d graphic).

Well, I did by

set.seed(138813)
x<-rnorm(n); y<-rnorm(n)
plot(x,y)

I was asked whether it might be possible to add a histogram with
superimposed normal curve to the drawing: no problem. "And where is the
standard deviation", well abline(v=sd(... OK.

Then I realized, that this is of course only true for one of the
distributions (x) and only in one "slice" of the scatterplot of x and y.
The real thing is is a 3d density map above the scatterplot. A very nice
example of this is demo(bivar) in the rgl package (for a picture see i.e
http://rgl.neoscientists.org/gallery.shtml right upper corner).

Great! But how to correctly draw the standard deviation boundaries for
the "shots on the target" (the scatterplot of x and y)...

I'd be grateful for hints on what to read on that matter (book, website
etc.)

Greetings from Munich, Felix.

Am 03.03.12 19:22, schrieb peter dalgaard:
> 
> On Mar 3, 2012, at 17:01 , drflxms wrote:
> 
>> # this is the critical block, which I still do not comprehend in detail
>> z <- array()
>> for (i in 1:n){
>>        z.x <- max(which(den$x < x[i]))
>>        z.y <- max(which(den$y < y[i]))
>>        z[i] <- den$z[z.x, z.y]
>> }
> 
> As far as I can tell, the point is to get at density values corresponding to the values of (x,y) that you actually have in your sample, as opposed to den$z which is for an extended grid of all possible (x_i, y_j) combinations. 
> 
> It's unclear to me what happens if you look at quantiles for the entire den$z. I kind of suspect that it is some sort of approximate numerical integration, but maybe not of the right thing.... 
> 
> Re SD: Once you go into two dimensions, SD loses all meaning, and adding nonparametric density estimation into the mix doesn't help, so just stop thinking in those terms! 
>