[R] Estimate region of highest probabilty density

Spencer Graves spencer.graves at pdf.com
Sat Jun 17 05:01:26 CEST 2006


	  I have more sympathy than ideas for you, but I'll attempt to offer 
the latter as I haven't seen other replies to your post.

	  First, if you've got a grid for "contour", you can do various kinds 
of numerical integration.  If I just wanted a rough answer and I didn't 
have to sell it to anyone, I might compute as fine a grid as I could 
reasonably do, then cut the grid in two above and below my desired 
threshold, and do some crude trapezoidal integration in each, use the 
sum of the two to normalize and be done with it.

	  However, if I needed something that would stand closer scrutiny, I'd 
read what the MASS book says about this.  I'd also try "RSiteSearch", 
which led me to 
"http://finzi.psych.upenn.edu/R/Rhelp02a/archive/26898.html", among 
other things.

	  If you can figure out how to get the coefficients of the spline fit, 
then you can do the desired integration analytically.

	  I know this doesn't solve your problem, but I hope it helps.

	  Spencer Graves

Christoph wrote:
> Estimate region of highest probabilty density
> 
> Dear R-community
> 
> I have data consisting of x and y. To each pair (x,y) a z 
value (weight) is assigned. With kde2d I can estimate the
densities on a regular grid and based on this make a contour
plot (not considering the z-values). According to an earlier
post in the list I adjusted the kde2d to kde2d.weighted (see
code below) to estimate the densities of x and y considering
their weights z. There's also a piece of code with artificial
data.
> 
> Now my question: Does a function exist which calculates the 
value of the level corresponding to a certain percentage of
the volume (i.e. above this level e.g. 95% of the total volume
lie, the (parameter) region of highest probability density)?
> 
> And secondly: How is it in the case of n parameters, when I 
don't want to plot it anymore but estimate quantiles for each
parameter considering also the weight z (to each set of
parameters c(v,w,x,y) a weight z is assigned)?
> 
> Many thanks in advance, I am very grateful for any hint
> Chris
> 
> 
> 
> 
> # MASS::kde2d copied and modified
> # ===============================
> 
> library(MASS)
> 
> kde2d.weighted <- function (x, y, w, h, n = n, lims = c(range(x), range(y))) {
>   nx <- length(x)
>   if (length(y) != nx) 
>       stop("data vectors must be the same length")
>   gx <- seq(lims[1], lims[2], length = n) # gridpoints x
>   gy <- seq(lims[3], lims[4], length = n) # gridpoints y
>   if (missing(h)) 
>     h <- c(bandwidth.nrd(x), bandwidth.nrd(y));
>   if (missing(w)) 
>     w <- numeric(nx)+1;
>   h <- h/4
>   ax <- outer(gx, x, "-")/h[1] # distance of each point to each grid point in x-direction
>   ay <- outer(gy, y, "-")/h[2] # distance of each point to each grid point in y-direction
>   z <- (matrix(rep(w,n), nrow=n, ncol=nx, byrow=TRUE)*matrix(dnorm(ax), n, nx)) %*% t(matrix(dnorm(ay), n, nx))/(sum(w) * h[1] * h[2]) # z is the density
>   return(list(x = gx, y = gy, z = z))
> }
> 
> 
> # Generate artificial data
> # ========================
> 
> x <- runif(20000,-5,5)
> y <- runif(20000,-5,5)
> z <- dnorm(x, mean=0, sd=1)*dnorm(y, mean=0, sd=1)
> 
> # plot data
> # =========
> 
> library(Rcmdr)
> scatter3d(x=x,z=y,y=z,surface=FALSE,xlab="x",ylab="z",zlab="y",bg.col="black")
> 
> temp0 <- kde2d(x=x, y=y, n = 100, lims=c(range(x),range(y))) 
> contour(x=temp0$x, y=temp0$y, z=temp0$z, xlab="x", ylab="y", main="z")
> 
> temp <- kde2d.weighted(x=x, y=y, w=z, n = 100, lims=c(range(x),range(y))) 
> contour(x=temp$x, y=temp$y, z=temp$z, xlab="x", ylab="y", main="z", col="red", add=TRUE)
> 
> 
> 
> 
> 
> °°°
> Christoph Ort
> Eawag - aquatic research
> Environmental Engineering
> Überlandstrasse 133
> CH-8600 Dübendorf
> 
> phone: +41-44-823-5041          
> fax:   +41-44-823-5389
> cell:  +41-79-218-9242
>  
> mailto:christoph.ort at eawag.ch
> 
> http://www.eawag.ch/~ortchris/
> 
> http://www.eawag.ch 
> 
> 
> 
> 
> °°°
> Christoph Ort
> Eawag - aquatic research
> Environmental Engineering
> Überlandstrasse 133
> CH-8600 Dübendorf
> 
> phone: +41-44-823-5041          
> fax:   +41-44-823-5389
> cell:  +41-79-218-9242
>  
> mailto:christoph.ort at eawag.ch
> 
> http://www.eawag.ch/~ortchris/
> 
> http://www.eawag.ch
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



More information about the R-help mailing list