[Rd] strptime(): on Linux system it seems to call system time?

Alexander Peterhansl APeterhansl at GAINCapital.com
Thu Apr 1 14:54:15 CEST 2010


Thanks for the two posts.

What if the timezone is set?  Then the issue of system calls for the
timezone falls away, no?
 
system.time(for (i in 1:100000)
    strptime("2010-03-10 17:00:00", "%F %H:%M:%S", tz="DST"))

Output on Linux Box (64-bit R 2.10.1 running on Intel Xeon E5520 @
2.27GHz):
   user  system elapsed 
 3.096    3.252    6.371

Original timing (tz not set):
   user  system elapsed 
   3.33   8.941  12.273

This does speed things up considerably, but I still don't know what all
that system time is used for.  If I can trace the system calls, I will
follow up.

As far as vectorization is concerned, this example was meant as
reproducible "toy" code to illustrate an issue in a more complex,
non-"vectorizable" setup.
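
Where the surrounding logic cannot be vectorized but the parsed values
are not needed until later, one possible workaround is to collect the
raw strings inside the loop and parse them in a single call afterwards.
The following is only a minimal sketch, assuming such deferral is
acceptable in the real setup; the variable names, the loop body and the
choice of tz = "UTC" are hypothetical and not taken from the original
code.

```r
n <- 1000
raw <- character(n)                  # pre-allocated buffer for timestamp strings

for (i in seq_len(n)) {
    # ... non-vectorizable work would happen here ...
    raw[i] <- "2010-03-10 17:00:00"  # stand-in for a value produced per iteration
}

# One vectorized call instead of n individual ones.
# tz = "UTC" is an arbitrary, assumed choice.
parsed <- strptime(raw, "%F %H:%M:%S", tz = "UTC")
```

This keeps the loop itself unchanged and moves only the date parsing out
of it, so the time zone is looked up once rather than once per iteration.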

Alex




-----Original Message-----
From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk] 
Sent: Thursday, April 01, 2010 4:38 AM
To: Patrick Connolly
Cc: Peter Dalgaard; r-devel at r-project.org; Alexander Peterhansl
Subject: Re: [Rd] strptime(): on Linux system it seems to call system
time?

Let me lay this to rest.  For some reason the OP did not use a 
vectorized call to strptime but 100000 individual calls (as well as 
making *false* claims about what strptime does and what is 'completely 
unnecessary', and seemingly being ignorant of system.time()).

I do not believe this is ever an issue for well-written R code.

Each time strptime() is called it needs to find and set the timezone 
(as whether an input is valid or not and whether it is in DST depends 
on the timezone).  If tz = "", the default, it needs to ask the system 
what the current timezone is via the C call tzset.  On well-written C 
runtimes tzset caches and so is fast after the first time.  On some 
others it reads files such as /etc/localtime each time.
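
The effect of the default tz = "" versus an explicit time zone can be
measured with a sketch along the following lines.  It is an illustration
only: the helper name time_parse, the repetition count and the choice of
"UTC" are assumptions, not part of the measurements reported in this
thread, and whether the two cases differ at all depends on the C runtime
as described above.

```r
# Hypothetical helper: time n individual strptime() calls for a given tz.
time_parse <- function(tz, n = 10000) {
    system.time(
        for (i in seq_len(n))
            strptime("2010-03-10 17:00:00", "%F %H:%M:%S", tz = tz)
    )
}

# tz = "" is the default: the current time zone is looked up on each call.
default_tz <- time_parse("")

# An explicit zone ("UTC" is an arbitrary, assumed choice).
explicit_tz <- time_parse("UTC")

# Compare the 'system' column; how much it differs is platform-dependent.
timings <- rbind(default = default_tz, explicit = explicit_tz)[, 1:3]
print(timings)
```

A large value in the system column for either row would suggest that the
runtime is doing work in the kernel (for example, re-reading time zone
files) on every call.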

On my Linux system (x86_64 Fedora 12)

system.time(for (i in 1:100000)
    strptime("2010-03-10 17:00:00", "%F %H:%M:%S"))
    user  system elapsed
   1.048   0.222   2.086
system.time(strptime(rep("2010-03-10 17:00:00", 100000), "%F %H:%M:%S"))
    user  system elapsed
   0.371   0.184   0.579

whereas on my 2008 Mac laptop

    user  system elapsed
   7.402   0.015   7.441
    user  system elapsed
   6.689   0.013   6.716

and on my 2005 Windows laptop

    user  system elapsed
    2.47    0.00    2.47
    user  system elapsed
    1.39    0.00    1.40

(for which the credit is entirely due to the replacement code in R: 
Windows' datetime code is only used for strftime).

So looks like Apple could improve their POSIX datetime runtime, but 
I've never seen an R application where parsing dates took longer than 
reading the original posting (let alone the time taken to read some 
good books on how to time R code and write it efficiently).


On Thu, 1 Apr 2010, Patrick Connolly wrote:

> On Sat, 20-Mar-2010 at 06:54PM +0100, Peter Dalgaard wrote:
>
> [...]
>
> |> It seems to be  completely system-dependent. On Fedora 9, I see
> |>
> |>    user  system elapsed
> |>   2.890   0.314   3.374
> |>
> |> but on openSUSE 10.3 it is
> |>
> |>    user  system elapsed
> |>   3.924   6.992  10.917
> |>
> |> At any rate, I suspect that this is an issue with the operating
> |> system and its C libraries, not with R as such.
>
> Were those 32 or 64 bit?
>
> With Fedora 11 and AMD Athlon 2 Ghz, I get
>
>   user  system elapsed
>  1.395   0.294   1.885
>
> with Mepis 7 on a Celeron 1.6 Ghz,
>
>   user  system elapsed
>  3.890   5.896   9.845
>
> Both of those are 32 bit.
> Maybe 64 bit does things very differently.
>
>
>
> -- 
>
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
>   ___    Patrick Connolly
> {~._.~}                   Great minds discuss ideas
> _( Y )_  	         Average minds discuss events
> (:_~*~_:)                  Small minds discuss people
> (_)-(_)  	                      ..... Eleanor Roosevelt
>
>
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


