[Rd] strptime(): on Linux system it seems to call system time?

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Apr 1 10:38:12 CEST 2010


Let me lay this to rest.  For some reason the OP did not use a 
vectorized call to strptime but 100000 individual calls (as well as 
making *false* claims about what strptime does and what is 'completely 
unnecessary', and seemingly being ignorant of system.time()).

I do not believe this is ever an issue for well-written R code.

Each time strptime() is called it needs to find and set the timezone 
(as whether an input is valid or not and whether it is in DST depends 
on the timezone).  If tz = "", the default, it needs to ask the system 
what the current timezone is via the C call tzset.  On well-written C 
runtimes tzset caches and so is fast after the first time.  On some 
others it reads files such as /etc/localtime each time.
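
As a rough sketch of how one might look at this per-call cost on a given 
system: the repeat count n and the explicit "UTC" zone below are 
illustrative choices only, and nothing here shows that a named zone is 
cheaper, since the timezone still has to be set on every call.

```r
# Illustrative sizes and zone; not taken from the original report.
n <- 1000
x <- "2010-03-10 17:00:00"
fmt <- "%Y-%m-%d %H:%M:%S"

# tz = "" (the default): the current timezone is looked up on each call.
t_default <- system.time(for (i in seq_len(n)) strptime(x, fmt))

# Explicit zone: the timezone is still set on each call, but it is named.
t_utc <- system.time(for (i in seq_len(n)) strptime(x, fmt, tz = "UTC"))

# Show user, system and elapsed times side by side.
print(rbind(default = t_default, UTC = t_utc)[, 1:3])
```

A large 'system' figure relative to 'user' would be consistent with a 
runtime that re-reads its timezone files on each call, but that is an 
inference to be checked on the system in question.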

On my Linux system (x86_64 Fedora 12)

system.time(for (i in 1:100000) strptime("2010-03-10 17:00:00", "%F %H:%M:%S"))
    user  system elapsed
   1.048   0.222   2.086
system.time(strptime(rep("2010-03-10 17:00:00", 100000), "%F %H:%M:%S"))
    user  system elapsed
   0.371   0.184   0.579

whereas on my 2008 Mac laptop (same two calls, in the same order)

    user  system elapsed
   7.402   0.015   7.441
    user  system elapsed
   6.689   0.013   6.716

and on my 2005 Windows laptop

    user  system elapsed
    2.47    0.00    2.47
    user  system elapsed
    1.39    0.00    1.40

(for which the credit is entirely due to the replacement code in R: 
Windows' datetime code is only used for strftime).

So it looks like Apple could improve their POSIX datetime runtime, but 
I've never seen an R application where parsing dates took longer than 
reading the original posting (let alone the time taken to read some 
good books on how to time R code and write it efficiently).
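
For anyone wanting to repeat the comparison, a minimal sketch follows. It 
uses a smaller n than above (my choice, to keep it quick) and the portable 
"%Y-%m-%d" in place of "%F", and it checks that the looped and vectorized 
forms parse to the same instants, so the only difference is the number of 
calls.

```r
# n is an illustrative size, smaller than the 100000 used above.
n <- 1000
x <- rep("2010-03-10 17:00:00", n)
fmt <- "%Y-%m-%d %H:%M:%S"

# One call per element, as in the original report.
looped <- vector("list", n)
loop_time <- system.time(
  for (i in seq_len(n)) looped[[i]] <- strptime(x[i], fmt)
)

# A single vectorized call over the whole character vector.
vec_time <- system.time(vectorized <- strptime(x, fmt))

# Compare the parsed instants as seconds since the epoch.
looped_secs <- vapply(looped, function(p) as.numeric(as.POSIXct(p)), numeric(1))
same <- all(looped_secs == as.numeric(as.POSIXct(vectorized)))

print(rbind(loop = loop_time, vectorized = vec_time)[, 1:3])
print(same)
```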


On Thu, 1 Apr 2010, Patrick Connolly wrote:

> On Sat, 20-Mar-2010 at 06:54PM +0100, Peter Dalgaard wrote:
>
> [...]
>
> |> It seems to be  completely system-dependent. On Fedora 9, I see
> |>
> |>    user  system elapsed
> |>   2.890   0.314   3.374
> |>
> |> but on openSUSE 10.3 it is
> |>
> |>    user  system elapsed
> |>   3.924   6.992  10.917
> |>
> |> At any rate, I suspect that this is an issue with the operating system
> |> and its C libraries, not with R as such.
>
> Were those 32 or 64 bit?
>
> With Fedora 11 and AMD Athlon 2 Ghz, I get
>
>   user  system elapsed
>  1.395   0.294   1.885
>
> with Mepis 7 on a Celeron 1.6 Ghz,
>
>   user  system elapsed
>  3.890   5.896   9.845
>
> Both of those are 32 bit.
> Maybe 64 bit does things very differently.
>
>
>
> -- 
> ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
>   ___    Patrick Connolly
> {~._.~}                   Great minds discuss ideas
> _( Y )_  	         Average minds discuss events
> (:_~*~_:)                  Small minds discuss people
> (_)-(_)  	                      ..... Eleanor Roosevelt
>
> ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


