[R] aggregate.zoo on bivariate data

Gabor Grothendieck ggrothendieck at gmail.com
Mon Aug 8 19:44:30 CEST 2011


On Mon, Aug 8, 2011 at 9:16 AM, Johannes Egner <johannes.egner at gmail.com> wrote:
> Hi,
>
> I'm removing non-unique time indices in a zoo time series by means of
> aggregate. The time series is bivariate, and the row to be kept only depends
> on the maximum of one of the two columns. Here's an example:
>
> x <- zoo(rbind( c(1,1), c(1.1, 0.9), c(1.1, 1.1), c(1,1) ),
>        order.by=c(1,1,2,2))
>
> The eventual aggregated result should be
>
> 1   1.1   0.9
> 2   1.1   1.1
>
> that is, in each slice of the underlying data (a slice being all rows with
> the same time stamp), we take the row that has maximum value in the first
> column. (For the moment, let's not worry about several rows within the same
> slice having the same maximum value in the first column.)
>
> I have tried subsetting x by
>
> slices <- aggregate(x[,1], by=identity, FUN=which.max)
>
> but ended up with something as ugly as:
>
> T <- length( unique(time(x)) )
> result <- zoo( matrix(NA, ncol=2, nrow=T), order.by=unique(time(x)) )
>
> for(t in seq(length.out=T))
> {
>    result[t,] <- x[ time(x)==time(slices[t]) ][coredata(slices[t]),]
>
> }
>
> There must be a better way of doing this -- maybe using tapply or the plyr
> package, but possibly something much simpler. Any pointers are very welcome.

Where does the data come from in the first place?  Is it being read
in, or is it in a data frame that is then converted to a zoo object?
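
If the data really is already in a zoo object as in your example, one
possible approach is sketched below. It is only a sketch under that
assumption, not necessarily the best way: it works on the plain matrix
and index rather than on the zoo object, uses tapply to pick one row
number per time stamp, and then rebuilds the series. The names m, tt
and keep are just illustrative.

```r
library(zoo)

# Example data from the question; zoo() warns here because the index
# entries are not unique, which is expected at this stage.
x <- zoo(rbind(c(1, 1), c(1.1, 0.9), c(1.1, 1.1), c(1, 1)),
         order.by = c(1, 1, 2, 2))

m  <- coredata(x)   # plain numeric matrix
tt <- index(x)      # time stamps, with duplicates

# For each distinct time stamp, find the row number whose first column
# is largest. which.max returns the first maximum in case of ties.
keep <- as.vector(tapply(seq_along(tt), tt,
                         function(i) i[which.max(m[i, 1])]))

# Rebuild a zoo series from the selected rows; its index is now unique.
result <- zoo(m[keep, , drop = FALSE], order.by = tt[keep])
result
```

With ties in the first column this keeps the first such row within
each slice, since that is how which.max behaves.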

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


