[Rd] R: ecdf - linear

Martin Maechler maechler at stat.math.ethz.ch
Wed Jan 18 15:02:14 CET 2006


I'm replying to R-devel, the mailing list which should be used
to discuss R feature enhancements.

>>>>> "Norman" == Norman Warthmann <norman at warthmann.com>
>>>>>     on Wed, 18 Jan 2006 11:33:22 +0100 writes:

    Norman> .......... 

    Norman> Is there a specific reason why in the ecdf-function
    Norman> the variable method="constant" is hard-coded?
Yes, see below.

    Norman> I would like to use method="linear" and I have created
    Norman> a new function based on yours just changing this and
    Norman> it seems to work. I am now wondering whether you did
    Norman> that on purpose? Maybe because there are problems
    Norman> that are not obvious? If there aren't, I would like
    Norman> to put in a feature request to include the "method"
    Norman> argument in ecdf.

It can't be the way you did it:

The class "ecdf" inherits from class "stepfun" which is defined
to be "Step functions" and a step function *is* piecewise
constant (also every definition of ecdf in math/statistics
only uses a piecewise constant function).
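
A small illustration of that point, using the test data from the quoted
message further down (a sketch only; the variable names are chosen here
for illustration):

```r
x  <- c(1, 2, 7, 8, 9, 10, 10, 10, 12, 13, 13, 13, 14)
Fn <- ecdf(x)

# An "ecdf" object is also a "stepfun" object.
is_step <- inherits(Fn, "stepfun")

# Between two adjacent data values (here 2 and 7) the function is flat ...
flat <- Fn(2) == Fn(6.99)

# ... and at a data value it jumps by (number of ties) / n.
jump <- Fn(10) - Fn(9.999)   # three 10s among 13 values
```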

Of course, it does make sense in some contexts to linearly
(or even "smoothly") interpolate an ecdf, one important context
being versions of "smoothed bootstrap", but the result is not a
proper ecdf anymore. 

I think you should rather define a function that takes an ecdf
(of class "ecdf" from R) as input
and returns a piecewise linear function {resulting from
approxfun() as in your example below}. However, that result must
*NOT* inherit from "ecdf" (nor from "stepfun").

And for that reason {returning a different class}, this
extension should NOT become part of ecdf() itself.
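
A minimal sketch of what such a transforming function could look like.
The function name interpolate_ecdf() and the class name
"interpolated_ecdf" are hypothetical, made up for this illustration; they
are not part of R:

```r
# Hypothetical helper (not part of R): turn an "ecdf" object into a
# piecewise linear interpolant of its values at the knots.
interpolate_ecdf <- function(Fn) {
    stopifnot(inherits(Fn, "ecdf"))
    xk <- knots(Fn)   # sorted unique data values
    yk <- Fn(xk)      # ecdf heights at those values
    if (length(xk) < 2)
        stop("need at least two distinct values to interpolate")
    f <- approxfun(xk, yk, method = "linear", yleft = 0, yright = 1)
    # Deliberately NOT "ecdf" or "stepfun": the result is no step function.
    class(f) <- c("interpolated_ecdf", class(f))
    f
}

test <- c(1, 2, 7, 8, 9, 10, 10, 10, 12, 13, 13, 13, 14)
Fn   <- ecdf(test)
Fl   <- interpolate_ecdf(Fn)
mid  <- Fl(4.5)   # halfway between the knots 2 and 7
```

With yleft = 0 the interpolant still jumps from 0 to 1/n at the smallest
data value, so this is only one of several possible conventions.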

If you write such an "ecdf -> interpolated_ecdf" transforming
function, it might be useful to include in the ecdf() help page
later, so "keep us posted".

Regards,
Martin Maechler, ETH Zurich



    Norman> my changed function:

    N>>   ecdf_linear<-function (x)
    N>>   {
    N>>        x <- sort(x)
    N>>        n <- length(x)
    N>>        if (n < 1)
    N>>            stop("'x' must have 1 or more non-missing values")
    N>>        vals <- sort(unique(x))
    N>>        rval <- approxfun(vals, cumsum(tabulate(match(x, vals)))/n,
    N>>                          method = "linear", yleft = 0, yright = 1,
    N>>                          f = 0, ties = "ordered")
    N>>        class(rval) <- c("ecdf", "stepfun", class(rval))
    N>>        attr(rval, "call") <- sys.call()
    N>>        rval
    N>>   }

    N>>   test<-c(1,2,7,8,9,10,10,10,12,13,13,13,14)
    N>>   constant<-ecdf(test)
    N>>   linear<- ecdf_linear(test)
    N>>   plot(constant(1:14),type="b")
    N>>   points(linear(1:14),type="b",col="red")


