AW: AW: [R] numericDeriv and ecdf

Khamenia, Valery V.Khamenia at BioVisioN.de
Mon Apr 28 10:18:37 CEST 2003


Dear Prof. Brian Ripley,

  First of all, thank you for your answer. I do appreciate 
  how you manage to keep up with all your activities and 
  still answer posts in this forum!

> An empirical CDF is a step function: it does not have a 
> derivative at the jump points, and has a zero 
> derivative everywhere else.

of course!

Let me add a few words about my simple motivation. 

1. I need an estimate of differential entropy.

2. I don't want an estimate of differential entropy that is 
   affected by smoothing kernels or by other hypotheses that 
   come in implicitly, as explained below.

3. The formula for differential entropy is based on the density:
   h(f) = -Integral f(x) log f(x) dx.

4. Density estimates based on real data are possible with 
   smoothing kernels only.

5. Applying smoothing kernels is not adequate when it is known 
   a priori that the family of distributions for my data is 
   extremely wide. (Indeed, I don't want any of the extra 
   hypotheses that come with smoothing kernels.)

6. The empirical CDF is quite OK as an "ascetic" estimate of the 
   distribution, i.e. it adds _nothing_ to (and removes _nothing_ 
   from) the hypotheses about the data distribution -- unlike 
   density estimates based on smoothing kernels.

7. I don't know a formula for estimating differential entropy 
   based on the CDF.

8. Therefore I should try to estimate differential entropy with 
   a density-based approach.

9. A histogram is quite a natural way to estimate a density.

10. Classical histograms are not adequate when it is known 
   a priori that the family of distributions for my data is 
   extremely wide. Indeed:

   a) one needs some assumptions about the distribution in 
      order to choose reasonable breaks for binning.

   b) any binning that is reasonable in terms of histogram 
      properties tends to destroy knowledge about the distribution 
      _within_ a bin -- only a trivial histogram with breaks 
      situated next to the data points is really acceptable for 
      keeping knowledge about the distribution as the ECDF does.

11. So we come to the "empirical density", which is a rather 
    uncommon term today. To see what I mean, please try:

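      n  <- 10000
      x  <- sort( rnorm(n) )
      dx <- diff(x)             # gaps between neighbouring points
      ed <- 1/n/dx              # mass 1/n spread over each gap
      plot(x[-1], ed, log="y")  # my "empirical density"
      lines(x, dnorm(x), col=2) # true density for comparison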
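
A small sketch of how this "empirical density" relates to ecdf(),
under assumptions that are not in the post itself (a fixed seed, a
smaller n, and the helper names Fn, slopes and total_mass). Between
neighbouring data points the ECDF rises by 1/n, so ed is simply the
slope of the ECDF across each gap, and the n - 1 gaps together carry
probability mass (n - 1)/n.

```r
set.seed(1)                  # assumption: fixed seed, for reproducibility only
n  <- 1000                   # assumption: smaller sample than in the post
x  <- sort(rnorm(n))
dx <- diff(x)                # gaps between neighbouring data points
ed <- 1 / n / dx             # empirical density on each of the n - 1 gaps

# Each gap carries mass ed * dx = 1/n, so the gaps sum to (n - 1)/n.
total_mass <- sum(ed * dx)

# The same values come out of the ECDF: it jumps by 1/n at every point,
# so its rise over a gap divided by the gap width equals ed.
Fn     <- ecdf(x)
slopes <- diff(Fn(x)) / dx
```

This assumes there are no ties in x; a tie gives dx == 0 and an
infinite ed for that gap.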

Now I can estimate the differential entropy like this:

      -sum(ed*log(ed)*dx)

That's it. 
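
A rough check of this estimate on a normal sample, as a sketch only.
Not part of the post: the fixed seed, the names h_naive, h_true,
euler_gamma and h_shifted, and the reading of the gap between the two
numbers. Because ed*dx is always 1/n, the sum reduces to
sum(log(n*dx))/n. For consecutive spacings, n*f(x)*dx behaves roughly
like an Exp(1) variable, whose log has mean -gamma (Euler's constant,
about 0.577), so the raw sum can be expected to land near the true
entropy minus gamma rather than on the true entropy itself.

```r
set.seed(42)                          # assumption: fixed seed, for reproducibility
n  <- 10000
x  <- sort(rnorm(n))
dx <- diff(x)
ed <- 1 / n / dx

h_naive <- -sum(ed * log(ed) * dx)    # the estimate as written in the post
h_same  <- sum(log(n * dx)) / n       # identical, since ed * dx == 1/n

h_true      <- 0.5 * log(2 * pi * exp(1))  # entropy of N(0,1) in nats
euler_gamma <- -digamma(1)                 # Euler's constant
h_shifted   <- h_naive + euler_gamma       # assumed bias correction, see above

print(c(naive = h_naive, shifted = h_shifted, true = h_true))
```

If that reasoning holds, adding gamma back should bring the estimate
close to 0.5*log(2*pi*e) for standard normal data. Ties in x
(dx == 0) would turn the sum into NaN and need separate handling.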

> What is this function `numericDerivative': do you mean `numericDeriv'?

Yes. Sorry, my non-Emacs email client has no auto-completion 
like the ESS environment in Emacs ;-)

kind regards,
Valery A.Khamenya
---------------------------------------------------------------------------
Bioinformatics Department
BioVisioN AG, Hannover


