[Rd] sapply(Date, is.numeric)

Martin Maechler maechler at stat.math.ethz.ch
Thu Jul 31 10:13:45 CEST 2008


>>>>> "PBR" == Prof Brian Ripley <ripley at stats.ox.ac.uk>
>>>>>     on Thu, 31 Jul 2008 08:36:22 +0100 (BST) writes:

    PBR> I've now committed fixes in R-patched and R-devel.
    PBR> There is one consequence: data.matrix() was testing for numeric columns by 
    PBR> unlist(lapply(x, is.numeric)) and so incorrectly treating Date and POSIXct 
    PBR> columns as numeric (which we had decided they were not).  This affects 
    PBR> package gvlma.

    PBR> data.matrix() is now working as documented, but as we have an exception 
    PBR> for factors, do we also want exceptions for Date and POSIXct?

Yes, that's a good idea, and much in the spirit of data.matrix()
as I have understood it.
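
As a minimal sketch of the column test PBR describes (the data frame
`df` and its column names are made up for illustration, and the result
assumes an R version that includes the fix):

```r
## Hypothetical data frame: one Date, one POSIXct and one plain numeric column.
df <- data.frame(d = Sys.Date(), t = Sys.time(), x = 1)

## The per-column test quoted above.  With S3 dispatch working inside
## lapply(), is.numeric() is FALSE for the Date and POSIXct columns,
## so only 'x' counts as numeric.
is_num <- unlist(lapply(df, is.numeric))
print(is_num)
```

This is why a data.matrix() that relies on this test now rejects Date
and POSIXct columns unless they get an explicit exception.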

Note the following from help(data.matrix), where I think the 'Title'
and 'Description' are more liberal (rightly so) than 'Details':

 >> Convert a Data Frame to a Numeric Matrix
 >> 
 >> Description:
 >> 
 >>      Return the matrix obtained by converting all the variables in a
 >>      data frame to numeric mode and then binding them together as the
 >>      columns of a matrix.  Factors and ordered factors are replaced by
 >>      their internal codes.

     [...........]

 >> Details:
 >> 
 >>      Supplying a data frame with columns which are not numeric, factor
 >>      or logical is an error.  A warning is given if any non-factor
 >>      column has a class, as then information can be lost.


Do we really have good reasons to give an error if a column is
not numeric (nor of the "exception class")?

Couldn't we just   lapply(., as.numeric)
and if that doesn't give errors 
just "be happy" ?
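
A rough sketch of what such a lenient conversion might look like.  This
is not the data.matrix() implementation; the helper name
`lenient_data_matrix` and the example data are hypothetical:

```r
## Hypothetical helper: convert every column with as.numeric(),
## keeping the existing exception for factors (internal codes).
lenient_data_matrix <- function(df) {
    stopifnot(is.data.frame(df))
    cols <- lapply(df, function(col) {
        if (is.factor(col)) as.integer(col)  # factor -> internal codes
        else as.numeric(col)                 # Date, POSIXct, logical, ... -> double
    })
    ## cbind() takes the column names from the names of the list
    do.call(cbind, cols)
}

## Made-up example data: a Date, a factor and a numeric column.
df <- data.frame(d = as.Date("2008-07-31") + 0:1,
                 f = factor(c("a", "b")),
                 x = c(1.5, 2.5))
m <- lenient_data_matrix(df)
print(m)
```

Note that as.numeric() on a character column does not give an error but
NAs with a warning, so "no errors" alone may be too weak a criterion.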

Martin


    PBR> On Wed, 30 Jul 2008, Martin Maechler wrote:

    >>>>>>> "BDR" == Prof Brian Ripley <ripley at stats.ox.ac.uk>
    >>>>>>> on Wed, 30 Jul 2008 13:29:38 +0100 (BST) writes:
    >> 
    BDR> On Wed, 30 Jul 2008, Martin Maechler wrote:
    >> >>>>>>> "RobMcG" == McGehee, Robert <Robert.McGehee at geodecapital.com>
    >> >>>>>>> on Tue, 29 Jul 2008 15:40:37 -0400 writes:
    >> >>
    RobMcG> FYI,
    RobMcG> I've tried posting the below message twice to the bug tracking system,
    >> >>
    >> >> [....... r-bugs problems discussed in a separate thread ....]
    >> >>
    >> >>
    >> >>
    RobMcG> R-developers,
    RobMcG> The results below are inconsistent. From the documentation for
    RobMcG> is.numeric, I expect FALSE in both cases.
    >> >>
    >> >> >> x <- data.frame(dt=Sys.Date())
    >> >> >> is.numeric(x$dt)
    RobMcG> [1] FALSE
    >> >> >> sapply(x, is.numeric)
    RobMcG> dt
    RobMcG> TRUE
    >> >>
    RobMcG> ## Yet, sapply seems aware of the Date class
    >> >> >> sapply(x, class)
    RobMcG> dt
    RobMcG> "Date"
    >> >>
    >> >> Yes, thanks a lot, Robert, for the report.
    >> >>
    >> >> That *is* a bug somewhere in the .Internal(lapply(...)) C code,
    >> >> when S3 dispatch of primitive functions should happen.
    >> 
    BDR> The bug is in do_is, which uses CHAR(PRINTNAME(CAR(call))), and when
    BDR> called from lapply that gives "FUN" not "is.numeric".  The root cause is
    BDR> the following comment
    >> 
    BDR> FUN = CADR(args);  /* must be unevaluated for use in e.g. bquote */
    >> 
    BDR> and hence that the function in the *call* passed to do_is can be
    BDR> unevaluated.
    >> 
    >> aah!  I see.
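
An R-level analogue of what BDR describes, as a rough sketch only: the
actual problem is in the C code of do_is, but sys.call() shows the same
effect, namely that a function called through lapply() sees itself
under the name FUN.  The helper name `show_call` is hypothetical:

```r
## Hypothetical helper: report the name under which the function was called.
show_call <- function(x) deparse(sys.call()[[1]])

direct  <- show_call(1)                    # called under its own name
via_lap <- lapply(list(1), show_call)[[1]] # called as FUN(X[[i]], ...)

print(direct)
print(via_lap)
```

Any code that derives the function name from the call, as do_is did,
therefore gets "FUN" instead of "is.numeric" when invoked via lapply().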
    >> 
    >> >> Here's an R scriptlet exposing a 2nd example
    >> >>
    >> >> ### lapply(list, FUN)
    >> >> ### ------------------ seems to sometimes fail for
    >> >> ### .Primitive S3-generic functions
    >> >>
    >> >> (ds <- seq(from=Sys.Date(), by=1, length.out=4))
    >> >> ##[1] "2008-07-30" "2008-07-31" "2008-08-01" "2008-08-02"
    >> >> ll <- list(d=ds)
    >> >> lapply(list(d=ds), round)
    >> >> ## -> Error in lapply(list(d = ds), round) : dispatch error
    >> 
    >> 
    BDR> And that's a separate issue, in DispatchGroup which states that arguments
    BDR> have been evaluated (true) but the 'call' from lapply gives the
    BDR> unevaluated arguments and so there is a mismatch.
    >> 
    >> yes, I too found that this was a separate issue, the latter
    >> one being new since version 2.7.0
    >> 
    BDR> I'm testing fixes for both.
    >> 
    >> Excellent!
    >> Martin
    >> 
    >> 
    >> >> ## or -- related to bug report by Robert McGehee on R-devel, on 2008-07-29:
    >> >> sapply(list(d=ds), is.numeric)
    >> >> ## TRUE
    >> >>
    >> >> ## in spite of
    >> >> is.numeric(`[[`(ll,1)) ## FALSE, because of is.numeric.Date
    >> >>
    >> >> ## or
    >> >> round(`[[`(ll,1))
    >> >> ## [1] "2008-07-30" "2008-07-31" "2008-08-01" "2008-08-02"
    >> >>
    >> >> ##-----------------------------
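
With both fixes in place, the scriptlet above would be expected to
dispatch correctly.  A sketch, using a fixed start date instead of
Sys.Date() so that it is reproducible (assumes an R version that
includes the fixes):

```r
## Same construction as in the scriptlet, but with a fixed start date.
ds <- seq(from = as.Date("2008-07-30"), by = 1, length.out = 4)
ll <- list(d = ds)

## round() is a primitive S3 generic; via lapply() it should now
## dispatch to the Date method and return Dates, not an error.
r <- lapply(ll, round)

## is.numeric() via sapply() should now agree with the direct call.
n <- sapply(ll, is.numeric)

print(r)
print(n)
```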
    >> >>
    >> >> But I'm currently too tied up with other duties
    >> >> to find and test a bug fix.
    >> >>
    >> >> Martin Maechler, ETH Zurich and R-Core Team
    >> 

    PBR> -- 
    PBR> Brian D. Ripley,                  ripley at stats.ox.ac.uk
    PBR> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
    PBR> University of Oxford,             Tel:  +44 1865 272861 (self)
    PBR> 1 South Parks Road,                     +44 1865 272866 (PA)
    PBR> Oxford OX1 3TG, UK                Fax:  +44 1865 272595


