# [R] Calculate average of many subsets based on columns in another dataframe

William Dunlap wdunlap at tibco.com
Thu Feb 11 00:02:10 CET 2016

```
You could try pulling some of the repeated subscripting operations,
especially the insertions, out of the loop.  E.g.,

values <- observations[, "values"]
date <- observations[, "date"]
groups$average <- vapply(seq_len(NROW(groups)), function(i)
    mean(values[date >= groups[i, "start"] & date <= groups[i, "end"]]),
    FUN.VALUE = 0)
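
# A further possibility, offered only as a sketch: if observations is
# sorted by date and has no missing values (both are assumptions, true for
# the example data), cumulative sums plus findInterval() remove the
# per-row function call entirely.  The names d, cs, lo, hi and n are
# illustrative; the left.open argument needs R >= 3.3.0.
d  <- as.numeric(observations[, "date"])
cs <- c(0, cumsum(observations[, "values"]))   # cs[k + 1] is the sum of the first k values
lo <- findInterval(as.numeric(groups[, "start"]), d, left.open = TRUE)  # count of dates < start
hi <- findInterval(as.numeric(groups[, "end"]), d)                      # count of dates <= end
n  <- hi - lo                                  # number of observations in each range
groups[, "average"] <- ifelse(n > 0, (cs[hi + 1] - cs[lo + 1]) / n, NaN)
# Results can differ from mean() in the last few digits because the sums
# are accumulated in a different order.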

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Feb 10, 2016 at 12:18 PM, Peter Lomas <peter.br.lomas at gmail.com>
wrote:

> Hello, I have a dataframe with a date range, and another dataframe
> with observations by date.  For each date range, I'd like to average
> the values within that range from the other dataframe.  I've provided
> code below doing what I would like, but using a for loop is too
> inefficient for my actual case (takes about an hour).  So I'm looking
> for a way to vectorize.
>
>
> set.seed(345)
> date.range <- seq(as.POSIXct("2015-01-01"),as.POSIXct("2015-06-01"),
> by="DSTday")
> observations <- data.frame(date=date.range, values=runif(152,1,100) )
> groups <- data.frame(start=sample(date.range[1:50], 20), end =
> sample(date.range[51:152], 20), average = NA)
>
> #Potential Solution (too inefficient)
>
> for(i in 1:NROW(groups)){
>  groups[i, "average"] <- mean(observations[observations$date >=
>    groups[i, "start"] & observations$date <= groups[i, "end"], "values"])
> }
>
> As an extension to this, there will end up being multiple value
> columns, and each range will also identify which column to average.  I
> think if I can figure out the first problem I can try to extend it
> myself.
>
> Thanks,
> Peter