[Rd] Couldn't (and shouldn't) is.unsorted() be faster?

Bill Dunlap bill at insightful.com
Thu Apr 17 21:26:12 CEST 2008


On Thu, 17 Apr 2008, Herve Pages wrote:

> Couldn't is.unsorted() bail out immediately here (after comparing
> the first 2 elements):
>
>  > x <- 20000000:1
>  > system.time(is.unsorted(x), gcFirst=TRUE)
>     user  system elapsed
>    0.084   0.040   0.124
>
>  > x <- 200000000:1
>  > system.time(is.unsorted(x), gcFirst=TRUE)
>     user  system elapsed
>    0.772   0.440   1.214

The C code does bail out upon seeing the first out- of-order pair, but
before calling the C code, the S code does any(is.na(x)), forcing a
scan of the entire data.  If you remove the is.na calls from
is.unsorted's S code you will see the timings improve in your example.
(It looks easy to do the NA checks in the C code.)

   is.unsorted.no.nacheck <- function (x, na.rm = FALSE) {
       if (is.null(x))
           return(FALSE)
       if (!is.atomic(x))
           return(NA)
       .Internal(is.unsorted(x))
   }
   > x <- 20000000:1
   > system.time(is.unsorted(x), gcFirst=TRUE)
   user  system elapsed
     0.356   0.157   0.514
   > system.time(is.unsorted.no.nacheck(x), gcFirst=TRUE)
   user  system elapsed
      0       0       0
   > revx <- rev(x)
   > system.time(is.unsorted(revx), gcFirst=TRUE)
      user  system elapsed
     0.500   0.170   0.672
   > system.time(is.unsorted.no.nacheck(revx),gcFirst=TRUE)
      user  system elapsed
     0.131   0.000   0.132

----------------------------------------------------------------------------
Bill Dunlap
Insightful Corporation
bill at insightful dot com
360-428-8146

 "All statements in this message represent the opinions of the author and do
 not necessarily reflect Insightful Corporation policy or position."



More information about the R-devel mailing list