[R] lubridate:ymd_hm and coercion of class POSIXct. Smooth way to restore the date format.

Henrik Pärn henrik.parn at bio.ntnu.no
Fri Mar 30 16:22:14 CEST 2012


Dear all,

I wish to create a POSIXct variable from date and time variables using the ymd_hm function in package lubridate. In some cases data for time is missing, which causes a problem for ymd_hm. I wish to find a smooth way to handle this.


# Some example data:
x <- data.frame(date =  c("2011-09-22", "2011-07-28"), time = c("15:00", NA))
x

# paste date and time together into a format that ymd_hm recognizes
x$datetime <- with(x, ifelse(!is.na(time), paste(date, time, sep = " "), NA))
x

# try ymd_hm on rows with complete data on date and time.
# install.packages("lubridate")
library(lubridate)
ymd_hm(x$datetime[!is.na(x$time)])
# Looks OK

# try on the whole data frame:
ymd_hm(x$datetime)
# missing data causes error. Fair enough.
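
For comparison, base R's as.POSIXct() returns NA for a missing element instead of stopping. This is only a sketch of an alternative parser, not a description of how lubridate behaves; the names `stamps` and `parsed` are made up for illustration, and the "UTC" time zone is an assumption.

```r
# Hypothetical input: one complete timestamp and one missing value
stamps <- c("2011-09-22 15:00", NA)

# as.POSIXct() with an explicit format passes NA through as NA
parsed <- as.POSIXct(stamps, format = "%Y-%m-%d %H:%M", tz = "UTC")

is.na(parsed)                        # FALSE TRUE
format(parsed[1], "%Y-%m-%d %H:%M")  # "2011-09-22 15:00"
```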

# try to allocate the result to the data frame on rows with complete data
x$datetime2[!is.na(x$time)] <- ymd_hm(x$datetime[!is.na(x$time)])
x

class(ymd_hm(x$datetime[!is.na(x$time)]))
class(x$datetime2)
# POSIXct is coerced to numeric. Fair enough
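
The coercion can be reproduced with base R alone. A minimal sketch, assuming a plain NA vector behaves like the not-yet-existing column; `t1` and `v` are hypothetical names used only here.

```r
# A POSIXct value is a number of seconds plus class and tzone attributes
t1 <- as.POSIXct("2011-09-22 15:00", tz = "UTC")

# Hypothetical plain vector standing in for a column without a class
v <- rep(NA, 2)
v[1] <- t1   # the default `[<-` keeps the number and drops the attributes

class(t1)                # "POSIXct" "POSIXt"
class(v)                 # "numeric"
v[1] == as.numeric(t1)   # TRUE
```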

# try to pre-allocate a vector and convert it to class POSIXct:
x$datetime3 <- NA
class(x$datetime3) <- "POSIXct"
x

x$datetime3[!is.na(x$time)] <- ymd_hm(x$datetime[!is.na(x$time)])
x

# something happened with the time (15:00 vs 17:00). A time zone issue?
tz(x$datetime3[!is.na(x$time)])
Sys.timezone()
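
One possible reading of the 15:00 vs 17:00 difference, not confirmed for this session: the stored instant is unchanged and only the zone used for display differs. The sketch assumes that the parsed value is in UTC and that the local zone is "Europe/Oslo" (two hours ahead of UTC on this date); both are assumptions.

```r
# The same instant, built directly in UTC
t1 <- as.POSIXct("2011-09-22 15:00", tz = "UTC")

# format() can render one instant in different zones without changing it
format(t1, "%H:%M", tz = "UTC")          # "15:00"
format(t1, "%H:%M", tz = "Europe/Oslo")  # "17:00"
```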

x$datetime3 <- with_tz(x$datetime3, "CEST")
x
# gives warnings, but shows the expected time (15:00)

# another 'random' try
x$datetime3 <- NA
x
class(x$datetime3) <- "POSIXct"
x$datetime3[!is.na(x$time)] <- ymd_hm(x$datetime[!is.na(x$time)])
x

tz(x$datetime3)
x$datetime3 <- with_tz(x$datetime3, "UTC")  
x

# works, but I don't understand why.
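
A minimal sketch of pre-allocating the target with the full class vector and an explicit tzone attribute, so that the subassignment keeps the class and printing uses a known zone. It uses base R only; `stamps`, `ok` and `out` are hypothetical names, "UTC" is an assumption, and this is not necessarily the recommended approach.

```r
# Hypothetical stand-ins for x$datetime and the complete-row index
stamps <- c("2011-09-22 15:00", NA)
ok <- !is.na(stamps)

# Pre-allocate NA date-times with the full class and an explicit time zone
out <- structure(rep(NA_real_, length(stamps)),
                 class = c("POSIXct", "POSIXt"), tzone = "UTC")

# `[<-.POSIXct` keeps the class and the tzone of the target vector
out[ok] <- as.POSIXct(stamps[ok], format = "%Y-%m-%d %H:%M", tz = "UTC")

class(out)               # "POSIXct" "POSIXt"
format(out[1], "%H:%M")  # "15:00"
is.na(out[2])            # TRUE
```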
#------------------------------------------

sessionInfo()
R version 2.14.2 (2012-02-29)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Norwegian (Bokmål)_Norway.1252  LC_CTYPE=Norwegian (Bokmål)_Norway.1252    LC_MONETARY=Norwegian (Bokmål)_Norway.1252 LC_NUMERIC=C                              
[5] LC_TIME=Norwegian (Bokmål)_Norway.1252    

attached base packages:
[1] grDevices datasets  splines   graphics  stats     tcltk     utils     methods   base     

other attached packages:
[1] lubridate_1.1.0  svSocket_0.9-51  TinnR_1.0.3      R2HTML_2.2       Hmisc_3.9-2      survival_2.36-12

loaded via a namespace (and not attached):
[1] cluster_1.14.2 grid_2.14.2    lattice_0.20-0 plyr_1.7.1     stringr_0.6    svMisc_0.9-63  tools_2.14.2

Windows 7
#------------------------------------------

I am sure I am missing something. Can anyone point me to a smooth way to handle the time zones (if that is *the* problem here...)?

Thanks a lot in advance for your time.


Best regards,


Henrik

--
Henrik Pärn
Centre for Conservation Biology
Department of Biology
Norwegian University of Science and Technology
NO-7491 Trondheim
NORWAY

Office: +47 735 96084
Mobile: +47 909 89 255
Fax: +47 735 96100