[R] Converting string to date

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Sun Aug 4 21:59:58 CEST 2013


On Sun, 4 Aug 2013, Ron Michael wrote:

> Hi,
>  
> I want to convert following string to a Date format (mm/dd/yyyy):
>  
> MyString <- c("Sun Sep 01 00:00:00 EDT 2013", "Sun Dec 01 00:00:00 EST 2013")
>  
> Can somebody point me if it is possible to do that?

I think the answer to "is it possible" is a qualified yes... most things 
are possible if you limit your scope enough.

EST and EDT are part of an informal timezone identification system that is 
not standardized around the world, so your format is not strictly 
unambiguous. For example, EDT can refer to daylight saving time in eastern 
North America (UTC-0400) or to daylight saving time in eastern Australia 
(UTC+1100). [1] You have to interpret the EST/EDT notation according to 
your local expectations, and be careful not to apply it to data that falls 
outside your local assumptions. Here, I assume you mean to handle data 
from the eastern area of the United States:

MyString <- c( "Sun Sep 01 00:00:00 EDT 2013"
              , "Sun Dec 01 00:00:00 EST 2013"
              , "Sun Dec 01 00:00:00 AST 2013" )

Sys.setenv( TZ="America/New_York" )

MyStringFixed <- MyString
MyStringFixed <- sub( 'EST', '-0500', MyStringFixed )
MyStringFixed <- sub( 'EDT', '-0400', MyStringFixed )
MyStringFixed <- sub( 'AST', '-0400', MyStringFixed )

# see ?strptime for format string definitions
MyDateTm <- as.POSIXct( MyStringFixed, format="%a %b %d %H:%M:%S %z %Y" )

# format() takes the output format string directly
MyDateStr <- format( MyDateTm, format="%m/%d/%Y" )
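
A variation on the substitutions above, offered only as a sketch: keep the 
abbreviation-to-offset mapping in one named vector and stop if an 
abbreviation turns up that you have not decided how to interpret. The 
name tzOffsets is made up for this illustration, and the offsets in it 
encode my assumption of North American usage.

```r
# Hypothetical lookup table; the offsets assume North American usage
tzOffsets <- c( EST="-0500", EDT="-0400", AST="-0400" )

MyString <- c( "Sun Sep 01 00:00:00 EDT 2013"
             , "Sun Dec 01 00:00:00 EST 2013"
             , "Sun Dec 01 00:00:00 AST 2013" )

# the abbreviation is the fifth space-separated field
abbrev <- sapply( strsplit( MyString, " ", fixed=TRUE ), `[`, 5 )

# refuse to guess about abbreviations that are not in the table
unknown <- setdiff( abbrev, names( tzOffsets ) )
if ( 0 < length( unknown ) ) {
  stop( "unrecognized timezone abbreviation: "
      , paste( unknown, collapse=", " ) )
}

# swap each abbreviation for its numeric offset, one string at a time
MyStringFixed <- mapply( function( s, a ) {
                           sub( paste0( " ", a, " " )
                              , paste0( " ", tzOffsets[[ a ]], " " )
                              , s
                              , fixed=TRUE )
                         }
                       , MyString, abbrev, USE.NAMES=FALSE )
```

The result can be passed to as.POSIXct with the same "%z" format string 
as before.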

As my code shows, you also need to be aware that only one timezone can be 
in effect for a particular POSIXct vector. Although the input strings may 
each use a different timezone, they are all converted internally to UTC 
and displayed according to the tz attribute of the whole vector (in this 
case the empty string, which refers to the TZ environment variable). As a 
result, in the general case of arbitrary input timezones, the date that is 
shown after conversion may not be the same as the original "date" in the 
input string (because of timezone differences).
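
To make the date shift concrete, here is a small sketch that formats one 
instant under two display timezones. The instant is midnight at -0400 on 
Dec 1, written as its UTC equivalent so that no parsing of abbreviations 
is involved; the variable names and the choice of America/Halifax as an 
AST zone are mine.

```r
# 00:00 at offset -0400 on 2013-12-01 is 04:00 UTC on the same day
x <- as.POSIXct( "2013-12-01 04:00:00", tz="UTC" )

# the same instant, displayed under two different timezones
inNewYork <- format( x, format="%m/%d/%Y", tz="America/New_York" )
inHalifax <- format( x, format="%m/%d/%Y", tz="America/Halifax" )

inNewYork   # "11/30/2013" (23:00 EST on the previous day)
inHalifax   # "12/01/2013" (00:00 AST)
```

Passing tz= to format() is one way to control the display timezone 
without changing the TZ environment variable.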

If you want to cover your ears and eyes and say "Timezones don't exist" 
then you can simply strip out the timezone information from the input 
strings before you convert them:

MyStringFixed <- MyString
MyStringFixed <- sub( '[0-9][0-9]:[0-9][0-9]:[0-9][0-9] [^ ]+ '
                     , ''
                     , MyStringFixed )
MyDateTm <- as.POSIXct( MyStringFixed, format="%a %b %d %Y" )
MyDateStr <- format( MyDateTm, format="%m/%d/%Y" )
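
If only the calendar date matters, a sketch of the same idea using the 
Date class instead of POSIXct avoids timezones altogether. Setting 
LC_TIME to "C" is my addition; it is there because %a and %b match day 
and month names in the current locale, and your strings are in English.

```r
# %a and %b are locale-dependent; the C locale uses English names
Sys.setlocale( "LC_TIME", "C" )

MyString <- c( "Sun Sep 01 00:00:00 EDT 2013"
             , "Sun Dec 01 00:00:00 EST 2013" )

# drop the time-of-day and timezone fields, leaving "Sun Sep 01 2013"
MyStringFixed <- sub( ' [0-9:]{8} [^ ]+ ', ' ', MyString )

# Date objects carry no time of day and no timezone
MyDate <- as.Date( MyStringFixed, format="%a %b %d %Y" )
MyDateStr <- format( MyDate, format="%m/%d/%Y" )
```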

---

[1] http://www.timeanddate.com/library/abbreviations/timezones/

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------

