[R] merge.zoo returns unmatched dates

arun smartpink111 at yahoo.com
Mon Oct 1 14:47:11 CEST 2012


Hi,

You can also try this:
Vup<-read.table(text="
                Date, Velocity_m/s
 2010-01-21 07:42:00,    1.217943
 2010-01-21 07:43:00,    1.624395
 2010-01-21 07:44:00,    1.526379
 2010-01-21 07:45:00,    1.456831
 2010-01-21 07:46:00,    1.245390
 2010-01-21 07:47:00,    1.374330
",sep=",",header=TRUE,stringsAsFactors=FALSE)
 
PAS<-read.table(text="
                Date,       PAS
 2010-01-21 05:01:00,  0.0013938
 2010-01-21 05:02:00,  0.0015331
 2010-01-21 05:03:00,  0.0016725
 2010-01-21 05:04:00,  0.0016725
 2010-01-21 05:05:00,  0.0012265
 2010-01-21 05:06:00,  0.0015889
",sep=",",header=TRUE,stringsAsFactors=FALSE)

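library(xts)
# Parse the timestamps as UTC so both series share one time zone
PAS$Date<-as.POSIXct(PAS$Date,format="%Y-%m-%d %H:%M:%S",tz="UTC")
Vup$Date<-as.POSIXct(Vup$Date,format="%Y-%m-%d %H:%M:%S",tz="UTC")
# Build xts objects indexed by Date, keeping the UTC time zone on the index
Vupxt<-xts(Vup[,2],order.by=Vup[,1],tzone="UTC")
PASxt<-xts(PAS[,2],order.by=PAS[,1],tzone="UTC")
# Merge on the time index, then convert the result to zoo
VUPPASxt<-merge(Vupxt,PASxt)
VUPPASzoo<-zoo(VUPPASxt)
VUPPASzoo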
#                       Vupxt     PASxt
#2010-01-21 05:01:00       NA 0.0013938
#2010-01-21 05:02:00       NA 0.0015331
#2010-01-21 05:03:00       NA 0.0016725
#2010-01-21 05:04:00       NA 0.0016725
#2010-01-21 05:05:00       NA 0.0012265
#2010-01-21 05:06:00       NA 0.0015889
#2010-01-21 07:42:00 1.217943        NA
#2010-01-21 07:43:00 1.624395        NA
#2010-01-21 07:44:00 1.526379        NA
#2010-01-21 07:45:00 1.456831        NA
#2010-01-21 07:46:00 1.245390        NA
#2010-01-21 07:47:00 1.374330        NA

str(VUPPASzoo)
#‘zoo’ series from 2010-01-21 05:01:00 to 2010-01-21 07:47:00
 # Data: num [1:12, 1:2] NA NA NA NA NA ...
 #- attr(*, "dimnames")=List of 2
  #..$ : chr [1:12] "2010-01-21 05:01:00" "2010-01-21 05:02:00" "2010-01-21 05:03:00" "2010-01-21 05:04:00" ...
  #..$ : chr [1:2] "Vupxt" "PASxt"
  #Index:  POSIXct[1:12], format: "2010-01-21 05:01:00" "2010-01-21 05:02:00" ...
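
The merged output in the quoted message below starts at 2010-01-20 21:01:00 instead of 2010-01-21 05:01:00, which is exactly eight hours earlier. One possible explanation (an assumption, not confirmed anywhere in this thread) is that the merged index is being printed in a local time zone eight hours behind UTC rather than in UTC. The following base-R sketch only illustrates that effect; "America/Los_Angeles" is a hypothetical local zone chosen because it is UTC-8 in January.

```r
# Base R only. "America/Los_Angeles" is an assumed local zone,
# not something stated in the original post.
x <- as.POSIXct("2010-01-21 05:01:00", tz = "UTC")

# The same instant rendered in two time zones.
utc_label   <- format(x, "%Y-%m-%d %H:%M:%S", tz = "UTC")
local_label <- format(x, "%Y-%m-%d %H:%M:%S", tz = "America/Los_Angeles")

print(utc_label)    # "2010-01-21 05:01:00"
print(local_label)  # "2010-01-20 21:01:00"

# The stored value (seconds since the epoch) is the same either way;
# only the printed representation differs.
print(as.numeric(x))
```

If that is what is happening, the values would still be attached to the correct instants and only the display would be shifted. Setting the session time zone with Sys.setenv(TZ="UTC") before merging, or keeping the time zone on the index as in the xts code above, may then make the printed dates line up.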


A.K.

 




----- Original Message -----
From: Vindoggy ! <vindoggy at hotmail.com>
To: r-help at r-project.org
Cc: 
Sent: Monday, October 1, 2012 2:29 AM
Subject: [R] merge.zoo returns unmatched dates


Sorry for the lack of reproducible data, but this seems to be a problem inherent to my dataset and I can't figure out where the issue is. 

I have several data frames set up as time series with identical POSIXct date formats. If I keep the original data in data frame format and merge them using base merge, everything is perfect and everyone is happy.
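
For reference, a base merge of two such data frames might look like the minimal sketch below. The two-row frames and the column name Velocity are made up for illustration, and all = TRUE (an outer join) is an assumption, since the post does not say which arguments were used.

```r
# Hypothetical two-row stand-ins for the Vup and PAS data frames.
Vup <- data.frame(
  Date = as.POSIXct(c("2010-01-21 07:42:00", "2010-01-21 07:43:00"), tz = "UTC"),
  Velocity = c(1.217943, 1.624395)
)
PAS <- data.frame(
  Date = as.POSIXct(c("2010-01-21 05:01:00", "2010-01-21 05:02:00"), tz = "UTC"),
  PAS = c(0.0013938, 0.0015331)
)

# Outer join on Date: keeps every timestamp from both frames
# and fills the side with no matching row with NA.
both <- merge(Vup, PAS, by = "Date", all = TRUE)

print(nrow(both))
print(format(both$Date, "%Y-%m-%d %H:%M:%S", tz = "UTC"))
```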

If I transform the data frames to zoo objects and then do a merge.zoo, the data seem to become uncoupled from the original data. Even more unusual is that some dates in the new merged data set are earlier than any in the original data sets. I've attempted below to show what this looks like, and I hope someone has a suggestion as to what may be causing the problem.

Here is one set of data in data.frame format:

head(Vup)
                 Date Velocity_m/s
1 2010-01-21 07:42:00     1.217943
2 2010-01-21 07:43:00     1.624395
3 2010-01-21 07:44:00     1.526379
4 2010-01-21 07:45:00     1.456831
5 2010-01-21 07:46:00     1.245390
6 2010-01-21 07:47:00     1.374330

str(Vup)
'data.frame':    7168 obs. of  2 variables:
$ Date        : POSIXct, format: "2010-01-21 07:42:00" "2010-01-21 07:43:00" ...
$ Velocity_m/s: num  1.22 1.62 1.53 1.46 1.25 ...

And here is a second in data.frame format:

head(PAS)
                 Date               PAS
1 2010-01-21 05:01:00   0.0013938
2 2010-01-21 05:02:00   0.0015331
3 2010-01-21 05:03:00   0.0016725
4 2010-01-21 05:04:00   0.0016725
5 2010-01-21 05:05:00   0.0012265
6 2010-01-21 05:06:00   0.0015889

str(PAS)
'data.frame':    5520 obs. of  2 variables:
$ Date       : POSIXct, format: "2010-01-21 05:01:00" "2010-01-21 05:02:00" ...
$ PAS: num  0.00139 0.00153 0.00167 0.00167 0.00123 ...



Using zoo:

PASmin<-zoo(as.matrix(PAS[,2]),as.POSIXct(PAS[,1],format="%Y-%m-%d %H:%M:%S",tz="UTC"))

str(PASmin)
‘zoo’ series from 2010-01-21 05:01:00 to 2010-01-27 13:01:00
  Data: num [1:5520, 1] 0.00139 0.00153 0.00167 0.00167 0.00123 ...
- attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr "PAS"
  Index:  POSIXct[1:5520], format: "2010-01-21 05:01:00" "2010-01-21 05:02:00" "2010-01-21 05:03:00" ...




ADP_UPmin<-zoo(as.matrix(Vup[,2]),as.POSIXct(Vup[,1], format="%Y-%m-%d %H:%M",tz="UTC"))

str(ADP_UPmin)
‘zoo’ series from 2010-01-21 07:42:00 to 2010-01-26 20:12:00
  Data: num [1:7168, 1] 1.22 1.62 1.53 1.46 1.25 ...
- attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr "UP_Velocity_m/s"
  Index:  POSIXct[1:7168], format: "2010-01-21 07:42:00" "2010-01-21 07:43:00" "2010-01-21 07:44:00" ...


And if I merge the two zoo objects I get this:

M<-merge(ADP_UPmin,PASmin)
head(M)

                    UP_Velocity_m/s       PAS
2010-01-20 21:01:00              NA 0.0013938
2010-01-20 21:02:00              NA 0.0015331
2010-01-20 21:03:00              NA 0.0016725
2010-01-20 21:04:00              NA 0.0016725
2010-01-20 21:05:00              NA 0.0012265
2010-01-20 21:06:00              NA 0.0015889


str(M)
‘zoo’ series from 2010-01-20 21:01:00 to 2010-01-27 05:01:00
  Data: num [1:8499, 1:2] NA NA NA NA NA NA NA NA NA NA ...
- attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:2] "UP_Velocity_m/s" "PAR"
  Index:  POSIXct[1:8499], format: "2010-01-20 21:01:00" "2010-01-20 21:02:00" "2010-01-20 21:03:00" ...


For some reason I cannot figure out, even though both the PAS data frame and the PAS zoo object start at 2010-01-21 05:01:00, once merged the PAS data starts on the previous day, at 2010-01-20 21:01:00 (eight hours earlier). The actual numeric data look good, but both variables have now come uncoupled from the time series dates (the Velocity data is similarly uncoupled). And as stated before, doing a non-zoo merge on the data.frame data works fine.

Anyone got any ideas what's going on?


                          






