[R] Kernel Density Estimation at manually specified points

Stephen Ellison Stephen.Ellison at lgcgroup.com
Tue Jun 28 13:34:16 CEST 2011


May be a kludge, but it might be simpler to write your own density function for a few specified points.

For example

my.density <- function(x, bw = 'nrd0', at) {
    x<-na.omit(x)
    
    #####
    #Borrowed from density.default for compatibility
    
    if (is.character(bw)) {
        if (length(x) < 2) 
            stop("need at least 2 points to select a bandwidth automatically")
        bw <- switch(tolower(bw), nrd0 = bw.nrd0(x), nrd = bw.nrd(x), 
            ucv = bw.ucv(x), bcv = bw.bcv(x), sj = , `sj-ste` = bw.SJ(x, 
                method = "ste"), `sj-dpi` = bw.SJ(x, method = "dpi"), 
            stop("unknown bandwidth rule"))
    }
    ######
    at <- matrix(at, ncol=1)
    y <- apply(at, 1, FUN=function(a, x, bw) sum(dnorm(a, x, bw)/length(x)), x=x, bw=bw )
    return(list(x=at, y=y))

}

x<-rnorm(500)
plot(density(x))
at<-seq(-2, 2, 0.25)
points(my.density(x, at=at))

 

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Carsten Harlaß
> Sent: 27 June 2011 20:19
> To: dcarlson at tamu.edu
> Cc: r-help at r-project.org
> Subject: Re: [R] Kernel Density Estimation at manually 
> specified points
> 
> Hello,
> 
> thanks for your response.
> 
> Of course I already thought about this "simple" solution of 
> the problem.
> But I think this is not a nice workaround.
> 
> If I understand the manual correctly, density() already makes 
> an approximation. But this approximation is just evaluated at 
> the wrong points.
> 
> See:
> 
> "Details
> 
> The algorithm used in density.default disperses the mass of 
> the empirical distribution function over a regular grid of at 
> least 512 points and then uses the fast Fourier transform to 
> convolve this approximation with a discretized version of the 
> kernel and then uses linear approximation to evaluate the 
> density at the specified points."
> 
> and
> 
> "n 	the number of equally spaced points at which the 
> density is to be
> estimated. When n > 512, it is rounded up to a power of 2 
> during the calculations (as fft is used) and the final result 
> is interpolated by approx. So it almost always makes sense to 
> specify n as a power of two. "
> 
> So this workaround means putting a second approximation on 
> top of an other approximation. That's not nice. Or is it?
> 
> With kind regards
> 
> Carsten
> 
> Am 27.06.2011 19:16, schrieb David L Carlson:
> > Look at ?approx. For your example (of course your random 
> numbers give 
> > different results):
> > 
> >> approx(f$x, f$y, c(-2, -1, 0, 1, 2))
> > $x
> > [1] -2 -1  0  1  2
> > 
> > $y
> > [1] 0.03757113 0.19007982 0.31941779 0.37066592 0.10227509
> > 
> > approx gives NA's if you try to interpolate outside the 
> bounds of the data.
> > ----------------------------------------------
> > David L Carlson
> > Associate Professor of Anthropology
> > Texas A&M University
> > College Station, TX 77843-4352
> > 
> > 
> > -----Original Message-----
> > From: r-help-bounces at r-project.org 
> > [mailto:r-help-bounces at r-project.org] On Behalf Of Carsten Harlaß
> > Sent: Sunday, June 26, 2011 7:02 PM
> > To: r-help at r-project.org
> > Subject: [R] Kernel Density Estimation at manually specified points
> > 
> > Hello,
> > 
> > my name is Carsten. This ist my first post to R-help mailing list.
> > 
> > I estimate densities with the function "density" out of the package 
> > "stats".
> > 
> > A simplified example:
> > 
> > 	
> > 	#generation of test data
> > 	n=10
> > 	z = rnorm(n)
> > 	
> > 	#density estimation
> > 	f=density(z,kernel="epanechnikov",n=n)
> > 
> > 	#evaluation
> > 	print(f$y[5])
> > 
> > Here I can only evaluate the estimation at given points. 
> These points 
> > are determined by the parameter n. By default they are equidistant 
> > distributed on the interesting interval.
> > 
> > But I need to evaluate the estimation (the estimated densitiy 
> > function) at manually specified points. For example I want 
> to compute f(z[i]).
> > This means I am interested in the estimated density at a the 
> > observation z[i].
> > 
> > Does anyone know how I can compute this? I think this is an 
> ordinary 
> > task so I would be surprised if R can not manage this. But 
> even after 
> > a long search I have found nothing.
> > 
> > Thanks in advance
> > 
> > Carsten Harlaß
> > 
> > --
> > Carsten Harlaß
> > Aachen University of Applied Sciences
> > Campus Jülich
> > E-Mail: carsten_harlass at web.de
> > 
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide 
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> > 
> 
> > --
> > Carsten Harlass
> > Aachen University of Applied Sciences
> > Campus Juelich
> > E-Mail: carsten_harlass at web.de
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> *******************************************************************
This email and any attachments are confidential. Any use...{{dropped:8}}



More information about the R-help mailing list