density(kernel = "cosine") .. the `wrong cosine' ..

Martin Maechler <maechler@stat.math.ethz.ch>
Wed, 1 Dec 1999 10:56:59 +0100 (MET)


I'm in teaching mode: kernel densities.

{History:  density() was newly introduced in version 0.15,  19 Dec 1996;
	   most probably by Ross or Robert
}

When I was telling the students about different kernels (and why their
choice is not so important, and "equivalent bandwidths", etc., etc.),
I wondered about the "Cosine" in my teaching notes, which
is defined there as

   k(x) = pi/4 * cos(pi/2 * x) *  I{ |x| <= 1 }
i.e. in R
  Kcos <- function(x) ifelse(abs(x) <= 1, pi/4 * cos(pi/2 * x), 0)

Now, R instead has (with scale  h <- bw/1.135724,  which makes the bandwidth
Gaussian-equivalent; here I simply take  h == 1/pi  to be comparable
with the above)
  
  Kcosine <- function(x) ifelse(abs(x) < 1, (1+cos(x*pi))/2 , 0)
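
As a side check on where the constant 1.135724 comes from, here is a minimal
sketch, assuming only base R's integrate(); the helper name kernel.sd is made
up for illustration. It verifies that both kernels have total mass 1 and that
the standard deviation of R's kernel, undoing the h == 1/pi scaling, matches
the constant:

```r
Kcos    <- function(x) ifelse(abs(x) <= 1, pi/4 * cos(pi/2 * x), 0)
Kcosine <- function(x) ifelse(abs(x) <  1, (1 + cos(x * pi))/2, 0)

## hypothetical helper: standard deviation of a symmetric kernel on [-1, 1]
kernel.sd <- function(K) sqrt(integrate(function(x) x^2 * K(x), -1, 1)$value)

## both should be proper densities (total mass 1)
mass <- c(integrate(Kcos, -1, 1)$value, integrate(Kcosine, -1, 1)$value)
mass

## with h == 1/pi the sd of R's kernel is (constant)/pi, so multiply back by pi
pi * kernel.sd(Kcosine)
sqrt(pi^2/3 - 2)          # closed form of the same quantity

## the literature kernel has sd sqrt(1 - 8/pi^2) on the same support
kernel.sd(Kcos)
```

So, at equal support [-1, 1], the two kernels do not have the same spread,
which is one more reason not to compare them at equal h.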

I've looked in Dave Scott's book and in Haerdle's "Smoothing... in S"
(Silverman doesn't mention any cosine kernel),
and both define the cosine kernel as I have it in my notes.

With the above R code, look at

   x <- seq(-1.2,1.2,len=501)
   matplot(x, cbind(Kcos(x),Kcosine(x)), type='l', lty=1)

The big difference:

  - R's version is smooth (differentiable at the border of support)
  - Scott's (not really "his", of course!) version is not differentiable
    but looks much closer to the Epanechnikov kernel and is hence almost
    as `good' (less than half a percent of MSE loss w.r.t Epanechnikov).
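
Both points can be checked numerically. The following is only a sketch under
stated assumptions: "MSE loss" is taken to mean the ratio of asymptotic MISE
constants at each kernel's own optimal bandwidth, i.e. proportional to
(R(K) * sd(K))^(4/5) with R(K) the integral of K^2; the names amise.const and
left.slope are invented here.

```r
Kcos    <- function(x) ifelse(abs(x) <= 1, pi/4 * cos(pi/2 * x), 0)
Kcosine <- function(x) ifelse(abs(x) <  1, (1 + cos(x * pi))/2, 0)
Kepa    <- function(x) ifelse(abs(x) <= 1, 3/4 * (1 - x^2), 0)

## hypothetical helper: kernel-dependent factor of the asymptotic MISE
amise.const <- function(K) {
  R  <- integrate(function(x) K(x)^2,     -1, 1)$value  # roughness of K
  s2 <- integrate(function(x) x^2 * K(x), -1, 1)$value  # variance of K
  (R * sqrt(s2))^(4/5)
}

## relative loss w.r.t. Epanechnikov (0 would mean "equally good")
loss <- sapply(list(Kcos = Kcos, Kcosine = Kcosine),
               function(K) amise.const(K) / amise.const(Kepa) - 1)
loss

## hypothetical helper: one-sided (left) difference quotient at the border x = 1
left.slope <- function(K, eps = 1e-6) (K(1) - K(1 - eps)) / eps
slopes <- c(Kcos = left.slope(Kcos), Kcosine = left.slope(Kcosine))
slopes   # literature kernel: about -pi^2/8; R's kernel: about 0
```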


Problem:

 - An average user knowing some statistics literature will most probably
   assume that a "cosine" kernel means the one in the literature, 
   *NOT* the one we have in R now.

Proposition / Possibilities / RFC [= Request For Comments] :

 - We CHANGE the behavior of  density(* , kernel="cosine")
   to use the cosine from the literature.

 - provide the current "cosine" as  kernel = "smoothcosine"
   {I'd like to keep the possibility of 1-initial-letter abbreviation}
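
To illustrate the abbreviation point: a minimal sketch, assuming the kernel
name is resolved by partial matching as match.arg() does. The vector of names
and the function pick.kernel are hypothetical and not the actual density()
code. Because "smoothcosine" starts with "s", a single initial letter still
identifies every kernel:

```r
## hypothetical list of kernel names under the proposal
kernels <- c("gaussian", "rectangular", "triangular",
             "cosine", "smoothcosine")

## hypothetical resolver: partial matching of the user's string
pick.kernel <- function(kernel) match.arg(kernel, kernels)

pick.kernel("c")    # "cosine"
pick.kernel("s")    # "smoothcosine"

## every 1-letter abbreviation is unique
initials <- substr(kernels, 1, 1)
anyDuplicated(initials) == 0
```

A name such as "cosine2" for the current kernel would not have this property,
since "c" would then be ambiguous.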


Enhancement (easy, I'll do that):

  - In any case, we additionally provide both
    Epanechnikov and "quartic" (aka "biweight").
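
A rough sketch of what these two kernels could look like, scaled so that bw is
the kernel's standard deviation (the same "Gaussian equivalent" convention as
the 1.135724 factor above). This is an illustration only, not the code that
would go into density(); the helpers scaled and sd.of are invented names.

```r
## Epanechnikov and biweight ("quartic") on [-1, 1]
Kepa <- function(x) ifelse(abs(x) <= 1, 3/4   * (1 - x^2),   0)
Kbiw <- function(x) ifelse(abs(x) <= 1, 15/16 * (1 - x^2)^2, 0)

## hypothetical helper: rescale a kernel K of variance v so that its sd is bw
scaled <- function(K, v) function(x, bw = 1) {
  a <- bw / sqrt(v)          # half-width of the support
  K(x / a) / a
}

Kepa.bw <- scaled(Kepa, 1/5)   # variance of Epanechnikov on [-1, 1] is 1/5
Kbiw.bw <- scaled(Kbiw, 1/7)   # variance of biweight on [-1, 1] is 1/7

## hypothetical helper: sd of a symmetric density f supported on [-a, a]
sd.of <- function(f, a) sqrt(integrate(function(x) x^2 * f(x), -a, a)$value)

## for bw = 1 both should have unit mass and unit sd
mass <- c(integrate(Kepa.bw, -sqrt(5), sqrt(5))$value,
          integrate(Kbiw.bw, -sqrt(7), sqrt(7))$value)
sds  <- c(sd.of(Kepa.bw, sqrt(5)), sd.of(Kbiw.bw, sqrt(7)))
rbind(mass, sds)
```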


Martin Maechler <maechler@stat.math.ethz.ch>	http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO D10	Leonhardstr. 27
ETH (Federal Inst. Technology)	8092 Zurich	SWITZERLAND
phone: x-41-1-632-3408		fax: ...-1228			<><