[R] more dates and data frames

Gabor Grothendieck ggrothendieck at gmail.com
Tue Jun 8 23:12:14 CEST 2010


Once again my message got held up for moderator approval so I
am deleting it and trying again. Hopefully this one goes through.

In general, the code is simplest when we match the problem to the
appropriate class. Here the data are time series, so it is
advantageous to use a time series class, i.e. zoo, instead of data
frames. We can use data frames, but then each time we run into a
problem that would be trivial with a time series class we have to
reinvent the wheel.

We read the data into a data frame, append a column of ones and then
pass it to read.zoo, which converts the index (column 1) to Date class
using the indicated format, splits on column 2 and aggregates using sum
(since, unlike the prior example, we now have duplicate dates within cat
and also within dog). Summing the ones gives a count for each date and
category. See ?read.zoo for more.

To fill in the missing dates we just convert the zoo series to ts and
back again. This loses the Date class (since ts has no notion of index
class) but we can put it back again. The newly added entries are filled
with NAs, so we replace the NAs with zeros.
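
As an aside, the same fill-in can be sketched without the round trip
through ts by merging with a zero-width zoo series whose index contains
every day in the range; this keeps the Date index throughout. The
series z below is a hypothetical three-point example, not the data from
this post, and the names grid and zfull are only for illustration.

```r
library(zoo)

# Hypothetical daily series with a two-day gap (not the data from this post).
z <- zoo(c(1, 2, 3), as.Date(c("2000-01-01", "2000-01-02", "2000-01-05")))

# Zero-width zoo series whose index is every day from start(z) to end(z).
grid <- zoo(, seq(start(z), end(z), by = "day"))

# Merging keeps the Date index; days absent from z are filled with 0.
zfull <- merge(z, grid, fill = 0)
zfull
```

Whether this or the ts round trip reads better is a matter of taste;
the merge needs the spacing ("day") spelled out, while the round trip
infers it from the series.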

Lines <- "V1 V2
1  1/1/2000  dog
2  1/1/2000  cat
3  1/1/2000 tree
4  1/1/2000  dog
5  1/2/2000  cat
6  1/2/2000  cat
7  1/2/2000  cat
8  1/2/2000 tree
9  1/3/2000  dog
10 1/3/2000 tree
11 1/6/2000  dog
12 1/6/2000  cat"

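library(zoo)
# source read.zoo.R (revision 719) directly from the zoo repository on R-Forge
source("http://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/zoo/R/read.zoo.R?revision=719&root=zoo")
# the header line has one field fewer, so column 1 of Lines becomes row names
DF <- read.table(textConnection(Lines))
# index = column 1 as Date, one series per level of column 2, summing the ones
z <- read.zoo(cbind(DF, 1), format = "%m/%d/%Y", split = 2, aggregate = sum)
# the round trip through ts adds the missing days as NA rows
zz <- as.zoo(as.ts(z))
# restore the Date index that ts dropped
time(zz) <- as.Date(time(zz))
# NAs become zero counts
zz[is.na(zz)] <- 0
zz

plot(zz)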
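
As a rough cross-check of the counts in zz, assuming only base R, we
can tabulate the same data with table(), using a factor whose levels
span every day in the range so that the missing dates show up as rows
of zeros. The names d, days and counts are illustrative, and this is
only a sketch for comparison, not a replacement for the zoo approach.

```r
Lines <- "V1 V2
1  1/1/2000  dog
2  1/1/2000  cat
3  1/1/2000 tree
4  1/1/2000  dog
5  1/2/2000  cat
6  1/2/2000  cat
7  1/2/2000  cat
8  1/2/2000 tree
9  1/3/2000  dog
10 1/3/2000 tree
11 1/6/2000  dog
12 1/6/2000  cat"

DF <- read.table(text = Lines)

# Parse the dates and build the full daily grid between first and last date.
d <- as.Date(DF$V1, format = "%m/%d/%Y")
days <- seq(min(d), max(d), by = "day")

# Rows: every day in the range (absent days count as 0); columns: categories.
counts <- table(factor(as.character(d), levels = as.character(days)), DF$V2)
counts
```

Unlike zz, the result is a plain contingency table, so it has none of
the time series methods (plot, merge, window and so on) that zoo gives.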


