[R] Calculate average of many subsets based on columns in another dataframe

Peter Lomas peter.br.lomas at gmail.com
Wed Feb 10 21:18:36 CET 2016


Hello, I have a dataframe of date ranges (one start and one end date
per row), and another dataframe with observations by date.  For each
date range, I'd like to average the values from the other dataframe
that fall within that range.  The code below does what I want, but the
for loop is too slow for my actual case (it takes about an hour), so
I'm looking for a way to vectorize it.


set.seed(345)
date.range <- seq(as.POSIXct("2015-01-01"), as.POSIXct("2015-06-01"),
                  by = "DSTday")
observations <- data.frame(date = date.range, values = runif(152, 1, 100))
groups <- data.frame(start = sample(date.range[1:50], 20),
                     end = sample(date.range[51:152], 20),
                     average = NA)

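#Potential Solution (too inefficient)

for (i in seq_len(nrow(groups))) {
  # observations whose date falls inside range i (both ends inclusive)
  in.range <- observations$date >= groups[i, "start"] &
    observations$date <= groups[i, "end"]
  groups[i, "average"] <- mean(observations[in.range, "values"])
}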
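
One possible way to drop the loop, offered only as a sketch and not as
a tested answer for the real data: sort the observations by date, take a
running total of the values, and read each range's sum off as a
difference of two running totals.  The helper names (obs, d, cs, lo, hi)
are made up for illustration.  It assumes there are no NA values and
that findInterval() supports left.open (R 3.3.0 or later).

```r
set.seed(345)
date.range <- seq(as.POSIXct("2015-01-01"), as.POSIXct("2015-06-01"),
                  by = "DSTday")
observations <- data.frame(date = date.range, values = runif(152, 1, 100))
groups <- data.frame(start = sample(date.range[1:50], 20),
                     end = sample(date.range[51:152], 20),
                     average = NA)

# Sort once so that every date range maps to a contiguous block of rows
obs <- observations[order(observations$date), ]
d <- as.numeric(obs$date)

# cs[k + 1] holds the sum of the first k values; cs[1] is 0
cs <- c(0, cumsum(obs$values))

# hi: number of observations with date <= end
hi <- findInterval(as.numeric(groups$end), d)
# lo: number of observations with date < start (strictly before the range)
lo <- findInterval(as.numeric(groups$start), d, left.open = TRUE)

# Sum over the range divided by the count; an empty range gives NaN,
# the same as mean(numeric(0)) in the loop version
groups$average <- (cs[hi + 1] - cs[lo + 1]) / (hi - lo)
```

The results should agree with the loop up to floating-point rounding,
since the sums are accumulated in a different order.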

As an extension to this, there will end up being multiple value
columns, and each range will also identify which column to average.  I
think if I can figure out the first problem I can try to extend it
myself.
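
For the multi-column case, a rough sketch along the same lines might
keep one running total per value column and pick the column for each
range with matrix indexing.  The column names values1 and values2, and
the "column" field in groups, are hypothetical stand-ins for whatever
the real data uses; as before this assumes no NA values and R 3.3.0 or
later for left.open.

```r
set.seed(345)
date.range <- seq(as.POSIXct("2015-01-01"), as.POSIXct("2015-06-01"),
                  by = "DSTday")
n.obs <- length(date.range)

# Hypothetical layout: two value columns, and a "column" field in groups
# naming which one each range should average
value.cols <- c("values1", "values2")
observations <- data.frame(date = date.range,
                           values1 = runif(n.obs, 1, 100),
                           values2 = runif(n.obs, 1, 100))
groups <- data.frame(start = sample(date.range[1:50], 20),
                     end = sample(date.range[51:n.obs], 20),
                     column = sample(value.cols, 20, replace = TRUE),
                     stringsAsFactors = FALSE)

obs <- observations[order(observations$date), ]
d <- as.numeric(obs$date)

# One column of running totals per value column, with a leading row of zeros
cs <- rbind(0, apply(as.matrix(obs[, value.cols]), 2, cumsum))

hi <- findInterval(as.numeric(groups$end), d)                      # dates <= end
lo <- findInterval(as.numeric(groups$start), d, left.open = TRUE)  # dates < start
col.idx <- match(groups$column, value.cols)

# A two-column index matrix picks one (row, column) cell per range
groups$average <- (cs[cbind(hi + 1, col.idx)] -
                   cs[cbind(lo + 1, col.idx)]) / (hi - lo)
```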

Thanks,
Peter


