[Rd] Long execution time for quantile() and difftime objects (PR#14091)

hong.ooi at anz.com
Fri Nov 27 06:55:10 CET 2009


Full_Name: Hong Ooi
Version: 2.10.0
OS: Windows XP
Submission from: (NULL) (203.110.235.1)


While trying to get summary statistics on a duration variable (the difference
between a start and end date), I ran into the following issue. Calling summary()
or quantile() (which summary() calls) on a difftime object takes an extremely
long time once the object is even moderately large: about 44 seconds for 10,000
elements in the example below.

A reproducible example:

> x <- as.Date(1:10000, origin="1900-01-01")
> x[1:10]
 [1] "1900-01-02" "1900-01-03" "1900-01-04" "1900-01-05" "1900-01-06"
 [6] "1900-01-07" "1900-01-08" "1900-01-09" "1900-01-10" "1900-01-11"
> d <- x - as.Date("1900-01-01")
> d[1:10]
Time differences in days
 [1]  1  2  3  4  5  6  7  8  9 10
> system.time(summary(d[1:10]))
   user  system elapsed 
   0.01    0.00    0.01 
> system.time(summary(d[1:100]))
   user  system elapsed 
   0.21    0.00    0.20 
> system.time(summary(d[1:1000]))
   user  system elapsed 
   3.02    0.00    3.02 
> system.time(summary(d[1:10000]))
   user  system elapsed 
  43.56    0.04   43.66 


If I unclass d, there is no problem:

> system.time(summary(unclass(d[1:10000])))
   user  system elapsed 
      0       0       0 
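
A minimal sketch of a workaround along these lines, assuming the slowdown is
confined to the difftime methods: compute the quantiles on the plain numeric
values and reattach the units afterwards. The helper name quantile_difftime is
hypothetical and is not part of base R.

```r
# Hypothetical helper (not in base R): take quantiles of a difftime by
# working on the underlying numbers, then restore the original units.
quantile_difftime <- function(d, probs = seq(0, 1, 0.25), ...) {
  stopifnot(inherits(d, "difftime"))
  u <- units(d)                          # e.g. "days"
  # as.numeric() drops the class, so no difftime method is dispatched here
  q <- quantile(as.numeric(d), probs = probs, ...)
  as.difftime(q, units = u)              # numeric input needs explicit units
}

x <- as.Date(1:10000, origin = "1900-01-01")
d <- x - as.Date("1900-01-01")
q <- quantile_difftime(d)
print(q)
```

The result is again a difftime in the original units, so it can be used where
the output of quantile(d) would have been.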

Testing with Rprof() indicates that the problem lies in [.difftime, although the
code for that function seems innocuous enough.
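
For anyone wanting to repeat the profiling, something like the following
should do. It is only a sketch: the temporary file name and the choice of 1000
elements are my own assumptions, and on a build where the call returns quickly
the profiler may record no samples at all, which is why summaryRprof() is
wrapped in tryCatch().

```r
x <- as.Date(1:10000, origin = "1900-01-01")
d <- x - as.Date("1900-01-01")

prof_file <- tempfile(fileext = ".out")   # assumed location for the profile
Rprof(prof_file)                          # start sampling the call stack
invisible(summary(d[1:1000]))
Rprof(NULL)                               # stop profiling

# summaryRprof() signals an error when no samples were recorded,
# so fall back to NULL in that case.
prof <- tryCatch(summaryRprof(prof_file), error = function(e) NULL)
if (!is.null(prof)) print(head(prof$by.self))
```

The by.self table lists the functions in which the most time was spent
directly.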


> sessionInfo()
R version 2.10.0 (2009-10-26) 
i386-pc-mingw32 

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252   
[3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C                      
[5] LC_TIME=English_Australia.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base


