[Rd] POSIXlt matching bug

Martin Maechler maechler at stat.math.ethz.ch
Fri Jul 2 17:05:33 CEST 2010


>>>>> "MM" == Martin Maechler <maechler at stat.math.ethz.ch>
>>>>>     on Fri, 2 Jul 2010 12:22:07 +0200 writes:

>>>>> "RobMcG" == McGehee, Robert <Robert.McGehee at geodecapital.com>
>>>>>     on Tue, 29 Jun 2010 10:46:06 -0400 writes:

    RobMcG> I came across the below mis-feature/bug using match with POSIXlt objects
    RobMcG> (from strptime) in R 2.11.1 (though this appears to be an old issue).

    >>> x <- as.POSIXlt(Sys.Date())
    >>> table <- as.POSIXlt(Sys.Date()+0:5)
    >>> length(x)
    RobMcG> [1] 1
    >>> x %in% table  # I expect TRUE
    RobMcG> [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    >>> match(x, table) # I expect 1
    RobMcG> [1] NA NA NA NA NA NA NA NA NA

    RobMcG> This behavior seemed more plausible when the length of a POSIXlt object
    RobMcG> was 9 (back in the day), however since the length was redefined, the
    RobMcG> length of x no longer matches the length of the match function output,
    RobMcG> as specified by the ?match documentation: "A vector of the same length
    RobMcG> as 'x'".

    RobMcG> I would normally suggest that we add a POSIXlt method for match that
    RobMcG> converts x into POSIXct or character first. However, match does not
    RobMcG> appear to be generic. Below is a possible rewrite of match that appears
    RobMcG> to work as desired.

    RobMcG> match <- function(x, table, nomatch = NA_integer_, incomparables = NULL)

    RobMcG> .Internal(match(if(is.factor(x)||inherits(x, "POSIXlt"))
    RobMcG> as.character(x) else x,
    RobMcG> if(is.factor(table)||inherits(table, "POSIXlt"))
    RobMcG> as.character(table) else table,
    RobMcG> nomatch, incomparables))

    RobMcG> That said, I understand some people may be very sensitive to the speed
    RobMcG> of the match function, 

    MM> yes, indeed. 

    MM> I'm currently investigating an alternative that takes considerably
    MM> more programming time but should in the end lose much less speed:
    MM> to .Internal()ize the tests in C code,
    MM> so that the resulting R code would simply be

    MM> match <- function(x, table, nomatch = NA_integer_, incomparables = NULL)
    MM> .Internal(match(x, table, nomatch, incomparables))

I have committed such a change to  R-devel, to be 2.12.x.
This should mean that  match() actually is now very slightly
faster than it used to be.
The speed gain may not be measurable though.
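
For R versions that do not yet have this change, one possible workaround
is to convert POSIXlt arguments to POSIXct before matching, as Robert
suggested. The following is only a minimal sketch: match_datetime is a
hypothetical helper name, not part of base R, and it ignores the
'incomparables' argument for brevity.

```r
# Hypothetical helper (not part of base R): convert POSIXlt arguments to
# POSIXct before calling match(), so the result has one entry per
# date-time in x rather than one per component of the underlying list.
match_datetime <- function(x, table, nomatch = NA_integer_) {
  if (inherits(x, "POSIXlt")) x <- as.POSIXct(x)
  if (inherits(table, "POSIXlt")) table <- as.POSIXct(table)
  match(x, table, nomatch = nomatch)
}

today <- Sys.Date()
x     <- as.POSIXlt(today)
table <- as.POSIXlt(today + 0:5)

res <- match_datetime(x, table)                      # position of x in table
in_table <- match_datetime(x, table, nomatch = 0L) > 0L
```

Here res has the same length as x, as ?match promises, and in_table
plays the role of x %in% table.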

Martin Maechler,  ETH Zurich



    RobMcG> and may prefer a simple change to the ?match
    RobMcG> documentation noting this (odd) behavior for POSIXlt. 

    RobMcG> Thanks, Robert

    RobMcG> R.version
    RobMcG> _                            
    RobMcG> platform       x86_64-unknown-linux-gnu     
    RobMcG> arch           x86_64                       
    RobMcG> os             linux-gnu                    
    RobMcG> system         x86_64, linux-gnu            
    RobMcG> status                                      
    RobMcG> major          2                            
    RobMcG> minor          11.1                         
    RobMcG> year           2010                         
    RobMcG> month          05                           
    RobMcG> day            31                           
    RobMcG> svn rev        52157                        
    RobMcG> language       R                            
    RobMcG> version.string R version 2.11.1 (2010-05-31)

    RobMcG> Robert McGehee, CFA
    RobMcG> Geode Capital Management, LLC
    RobMcG> One Post Office Square, 28th Floor | Boston, MA | 02109
    RobMcG> Tel: 617/392-8396    Fax:617/476-6389
    RobMcG> mailto:robert.mcgehee at geodecapital.com


