[R] Moving average in a data table

Gabor Grothendieck ggrothendieck at gmail.com
Sun Jun 26 00:07:26 CEST 2011


On Sat, Jun 25, 2011 at 3:15 PM, Roman Naumenko <roman at naumenko.ca> wrote:
> Hi,
>
> I'm trying to figure out common approach on calculating MA on a dataset that
> contains column "time".
>
> After digging around, I believe functions rollmean and rollaply should be
> used.
> However I don't quite understand the requirements for the underlying data.
> Should it be zoo object type? Formatted in a special way?
>
> As an example, I'm looking to get calculated avg=MA(variable) over 15 sec
> period on "time_sec" column:
>
>  date           variable   time_sec         avg
>  2011-05-17     132.55     11:29:59.579     132.55
>  2011-05-17     132.65     11:29:59.946     132.60
>  2011-05-17     132.5      11:29:59.946     132.57
>  2011-05-17     132.5      11:29:59.946     132.55
>  2011-05-17     132.55     11:29:59.946     132.55
>  2011-05-17     132.6      11:29:59.946     132.56
>  2011-05-17     132.55     11:29:59.946     132.56
>  2011-05-17     132.65     11:29:59.947     132.57
>  2011-05-17     132.85     11:30:00.45      132.60
>  2011-05-17     132.9      11:30:00.45      132.63
>  2011-05-17     133.05     11:30:00.45      132.67
>  2011-05-17     132.2      11:30:00.45      132.63
>  2011-05-17     132.5      11:30:00.45      132.62
>  2011-05-17     132.7      11:30:00.50      132.63
>  2011-05-17     132.75     11:30:00.57      132.63
>  2011-05-17     132.55     11:30:00.70      132.63
>  2011-05-17     132.25     11:30:00.70      132.61
>  2011-05-17     132.25     11:30:00.71      132.59
>  2011-05-17     132.35     11:30:00.173     132.57
>  2011-05-17     132.45     11:30:00.173     132.57
>

rollapply and rollmean are for fixed offsets such as 5 rows before and
after.  For this problem modify the following depending on your
precise requirements:

Lines <- "date           variable   time_sec         avg
 2011-05-17     132.55     11:29:59.579     132.55
 2011-05-17     132.65     11:29:59.946     132.60
 2011-05-17     132.5      11:29:59.946     132.57
 2011-05-17     132.5      11:29:59.946     132.55
 2011-05-17     132.55     11:29:59.946     132.55
 2011-05-17     132.6      11:29:59.946     132.56
 2011-05-17     132.55     11:29:59.946     132.56
 2011-05-17     132.65     11:29:59.947     132.57
 2011-05-17     132.85     11:30:00.45      132.60
 2011-05-17     132.9      11:30:00.45      132.63
 2011-05-17     133.05     11:30:00.45      132.67
 2011-05-17     132.2      11:30:00.45      132.63
 2011-05-17     132.5      11:30:00.45      132.62
 2011-05-17     132.7      11:30:00.50      132.63
 2011-05-17     132.75     11:30:00.57      132.63
 2011-05-17     132.55     11:30:00.70      132.63
 2011-05-17     132.25     11:30:00.70      132.61
 2011-05-17     132.25     11:30:00.71      132.59
 2011-05-17     132.35     11:30:00.173     132.57
 2011-05-17     132.45     11:30:00.173     132.57"

DF <- read.table(textConnection(Lines), header = TRUE)
DF <- transform(DF, datetime = as.POSIXct(paste(date, time_sec)))

f <- function(i) {
	is.near <- abs(as.numeric(DF$datetime[i] - DF$datetime)) < 7.5
	mean(DF$variable[is.near])
}
sapply(1:nrow(DF), f)



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



More information about the R-help mailing list