[R] Problem(?) in strptime()

Don MacQueen macq at llnl.gov
Mon Apr 8 21:33:38 CEST 2002


I think the following examples illustrate the crux of the matter 
(version and OS info are below).

The problem has to do with the transition from standard time to 
daylight savings time. My timezone, US/Pacific, has two parts: 
standard time (PST) 8 hours behind GMT and daylight savings time 
(PDT) 7 hours behind GMT. The transition takes place this year on 7 
April at 02:00, when 02:00 is re-labeled 03:00.

## April 6, 01:30 and 02:30
>  ISOdatetime(2002, 4, 6, 1:2, 30, 0,tz='GMT')
[1] "2002-04-05 17:30:00 PST" "2002-04-05 18:30:00 PST"

## April 7 , 01:30 and 02:30
>  ISOdatetime(2002, 4, 7, 1:2, 30, 0,tz='GMT')
[1] "2002-04-06 17:30:00 PST" "2002-04-06 17:30:00 PST"

The dates supplied are one day apart. The times supplied are one hour 
apart. However, the times returned are one hour apart in the first 
case, but identical in the second case.

>  tmp <- ISOdatetime(2002, 4, 7, 1:2, 30, 0,tz='GMT')
>  identical(tmp[1],tmp[2])
[1] TRUE

Of the four values returned, the last one is incorrect, because 
2002-4-7 2:30 GMT truly is 2002-4-6 18:30 PST.

What I need is a way to have that fourth case interpreted correctly.


Investigating a bit:

>  ISOdatetime
function (year, month, day, hour, min, sec, tz = "")
{
     x <- paste(year, month, day, hour, min, sec)
     as.POSIXct(strptime(x, "%Y %m %d %H %M %S"), tz = tz)
}

>  strptime
function (x, format)
.Internal(strptime(x, format))

ISOdatetime() uses strptime(), and strptime() does not use the 
timezone information. Indeed, according to ?strptime, a timezone (%Z) 
in a format specification is available for output only.

As far as I can tell, strptime() interprets everything in the local 
timezone, and when provided a time such as 2002-4-7 2:30 that 
"doesn't exist" in the local timezone, makes a reasonable attempt to 
guess what the user meant. But it doesn't work for what ISOdatetime() 
does with it when tz is something other than ''.

What I need is a way to tell R that the date-time string really truly 
should be interpreted as GMT. I haven't found a way. (Maybe setenv TZ 
GMT before starting R, but I'm still exploring that.)
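
For readers on a later version of R, here is a minimal sketch of what "interpret the string as GMT" can look like. It assumes a strptime() that accepts a tz argument, which the R 1.4.1 version shown above does not, and it assumes the zone name "America/Los_Angeles" (the Olson name for US/Pacific) is available on the system. Neither assumption comes from the session in this post.

```r
# Assumption: this R version's strptime() has a tz argument.
x <- c("2002-04-07 01:30", "2002-04-07 02:30")

# Interpret the strings as GMT, where 02:30 on 7 April does exist.
lt <- strptime(x, "%Y-%m-%d %H:%M", tz = "GMT")
ct <- as.POSIXct(lt)

# Seconds between the two instants; they should be one hour apart.
diff(as.numeric(ct))

# Display the same instants on the US/Pacific wall clock.
# Assumption: "America/Los_Angeles" names the same zone as US/Pacific.
format(ct, "%Y-%m-%d %H:%M", tz = "America/Los_Angeles")
```

Because the parsing happens in GMT, the missing local hour never enters the picture; the local zone is only used for display.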

Also as far as I can tell, strptime() uses an OS-supplied strptime if 
one is available, and R is entirely dependent on its behavior. I 
don't entirely understand what man strptime on my system says about 
this, but maybe it suggests that timezone information might be used 
if provided...

      %Z    Timezone name or no characters if no time zone  infor-
            mation  exists.  Local timezone information is used as
            though  strptime()  called  tzset()  (see  ctime(3C)).
            Errors  may not be detected.  This behavior is subject
            to change in a future release.


>  Sys.getlocale()
[1] "C"
>  Sys.getenv('TZ')
           TZ
"US/Pacific"
>  version
         _
platform sparc-sun-solaris2.7
arch     sparc
os       solaris2.7
system   sparc, solaris2.7
status
major    1
minor    4.1
year     2002
month    01
day      30
language R

I tried changing the locale
    Sys.setlocale('LC_TIME','en_GB')
(based on entries in /usr/lib/locale/lcttab), and
    Sys.putenv('TZ=GMT')
to no avail.

----------------------------------------------
This whole thing is motivated by the fact that I am receiving some 
data that is time-stamped, and the time stamps (in addition to having 
a poorly chosen format) ignore the daylight savings time convention. 
That is, they always use an 8 hour offset from GMT. Thus, the three 
times shown below are in fact an hourly sequence.

Sun Apr 07 01:30:58 2002
Sun Apr 07 02:30:58 2002
Sun Apr 07 03:30:58 2002

In order to convert these correctly to POSIXct, I thought a 
reasonable approach would be to tell R that they are in GMT, read 
them as such, and then convert to US/Pacific.

Here is what I have been using.

tmpd <- c('Sun Apr 07 01:30:58 2002',
           'Sun Apr 07 02:30:58 2002',
           'Sun Apr 07 03:30:58 2002')
tmpt <- as.POSIXct(strptime(tmpd,'%a %b %d %H:%M:%S %Y'),tz='GMT')+28800

>  tmpt
[1] "2002-04-07 01:30:58 PST" "2002-04-07 01:30:58 PST" "2002-04-07 04:30:58 PDT"

It works for the first and last times, but not the middle one
(3:30 "PST" = 4:30 PDT is correct, but 2:30 "PST" should be 3:30 PDT).

I would appreciate help finding a way that works for all of them 
simultaneously.
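
One possible way to carry out the "read as GMT, then shift by 8 hours" idea is sketched below. It is not the code used above: it assumes a later R in which strptime() accepts tz, assumes the zone name "America/Los_Angeles" is available, and forces the C locale for LC_TIME so that %a and %b match the English names in the stamps.

```r
# Assumption: English day/month names in the data; use the C locale
# so %a and %b match "Sun" and "Apr" whatever the session locale is.
invisible(Sys.setlocale("LC_TIME", "C"))

tmpd <- c("Sun Apr 07 01:30:58 2002",
          "Sun Apr 07 02:30:58 2002",
          "Sun Apr 07 03:30:58 2002")

# Parse as GMT so that no daylight saving rule touches the strings.
# Assumption: strptime() has a tz argument (not the case in R 1.4.1).
lt <- strptime(tmpd, "%a %b %d %H:%M:%S %Y", tz = "GMT")

# The stamps always run 8 hours behind GMT, so adding 8 hours
# (28800 seconds) turns them into the true instants.
tmpt <- as.POSIXct(lt) + 8 * 3600

# Show those instants on the US/Pacific wall clock.
format(tmpt, "%Y-%m-%d %H:%M:%S", tz = "America/Los_Angeles")
```

Under these assumptions the three stamps stay an hourly sequence, and the middle one displays as 03:30:58 local time.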

Thanks
-Don


-----------------
Here are some more attempts at various ways of looking at these 
dates, if anyone cares to wade through them.

>  strptime('2002-4-7 1:30' , '%Y-%m-%d %H:%M')
[1] "2002-04-07 01:30:00"
>  strptime('2002-4-7 2:30' , '%Y-%m-%d %H:%M')
[1] "2002-04-07 01:30:00"
>  strptime('2002-4-7 3:30' , '%Y-%m-%d %H:%M')
[1] "2002-04-07 03:30:00"

The first and last display as two hours apart.
The second one is interpreted by strptime() to be the same as the 
first one. Not unreasonable, but problematic as illustrated above.

-------
>  as.numeric(as.POSIXct(strptime('2002-4-7 1:30' , '%Y-%m-%d %H:%M')))
[1] 1018171800
>  as.numeric(as.POSIXct(strptime('2002-4-7 2:30' , '%Y-%m-%d %H:%M')))
[1] 1018171800
>  as.numeric(as.POSIXct(strptime('2002-4-7 3:30' , '%Y-%m-%d %H:%M')))
[1] 1018175400

>  1018175400 - 1018171800
[1] 3600

But in fact, the first and last are only one hour apart. This is 
correct, because the first one is PST, the third one is PDT.
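
The same arithmetic can be checked with the zone stated explicitly rather than taken from the TZ environment variable. This is a sketch only, assuming a later R in which as.POSIXct() takes a tz argument for character input and assuming "America/Los_Angeles" is an available name for US/Pacific.

```r
# Assumption: "America/Los_Angeles" is the same zone as US/Pacific.
tz <- "America/Los_Angeles"

first <- as.POSIXct("2002-04-07 01:30:00", tz = tz)  # before the change
last  <- as.POSIXct("2002-04-07 03:30:00", tz = tz)  # after the change

# Seconds since the epoch for each instant.
as.numeric(first)
as.numeric(last)

# Elapsed time: the clock labels differ by two hours, but the offset
# from GMT shrinks by one hour in between, so one hour has passed.
difftime(last, first, units = "hours")

# The offsets themselves, as +hhmm/-hhmm strings.
format(first, "%z")
format(last, "%z")
```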

-------
>  as.POSIXct(strptime('2002-4-7 1:30' , '%Y-%m-%d %H:%M'),tz='GMT')
[1] "2002-04-06 17:30:00 PST"
>  as.POSIXct(strptime('2002-4-7 2:30' , '%Y-%m-%d %H:%M'),tz='GMT')
[1] "2002-04-06 17:30:00 PST"
>  as.POSIXct(strptime('2002-4-7 3:30' , '%Y-%m-%d %H:%M'),tz='GMT')
[1] "2002-04-06 19:30:00 PST"

>  as.POSIXct(strptime('2002-4-7 1:30' , '%Y-%m-%d %H:%M'),tz='US/Pacific')
[1] "2002-04-07 01:30:00 PST"
>  as.POSIXct(strptime('2002-4-7 2:30' , '%Y-%m-%d %H:%M'),tz='US/Pacific')
[1] "2002-04-07 01:30:00 PST"
>  as.POSIXct(strptime('2002-4-7 3:30' , '%Y-%m-%d %H:%M'),tz='US/Pacific')
[1] "2002-04-07 03:30:00 PDT"

-- 
--------------------------------------
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
--------------------------------------