[R] Take average of previous weeks

Bert Gunter bgunter.4567 at gmail.com
Mon Mar 26 16:37:06 CEST 2018


The row means **are** new variables and can be put wherever you like.
But all columns in a data frame **must** have the same number of rows,
so you'll have to fill in missing values as appropriate. That's where
you need to "adjust as necessary to your needs."

-- Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Mar 26, 2018 at 6:22 AM, Miluji Sb <milujisb at gmail.com> wrote:
> Dear Bert,
>
> Thank you very much.This works. I was wondering if the fact that I want to
> create new variables (sorry for not stating that fact) makes any difference?
> Thank you again.
>
> Sincerely,
>
> Milu
>
> On Sun, Mar 25, 2018 at 10:05 PM, Bert Gunter <bgunter.4567 at gmail.com>
> wrote:
>>
>> I am sure that this sort of thing has been asked and answered before,
>> so in case my suggestions don't work for you, just search the archives
>> a bit more.
>> I am also sure that it can be handled directly by numerous functions
>> in numerous packages, e.g. via time series methods or by calculating
>> running means of suitably shifted series.
>>
>> However, as it seems to be a straightforward task, I'll provide what I
>> think is a simple solution in base R. Adjust to your situation.
>>
>> ## First I need a little utility function to offset rows. Lots of ways
>> to do this,many nicer than this I'm sure.
>>
>> > shift <- function(x,k)
>> +    ## x is a vector of values -- e.g. of a column in your df
>> + {
>> +    sapply(seq_len(k),function(i)c(rep(NA,i),head(x,-i)))
>> + }
>> >
>> >
>> > ## Testit
>> > x <- c(1,3,5,7,8:11)
>> > m <- shift(x,3) ## matrix of prior values up to lag 3
>> > m ## note rows have been omitted where lags don't exist
>>      [,1] [,2] [,3]
>> [1,]   NA   NA   NA
>> [2,]    1   NA   NA
>> [3,]    3    1   NA
>> [4,]    5    3    1
>> [5,]    7    5    3
>> [6,]    8    7    5
>> [7,]    9    8    7
>> [8,]   10    9    8
>> > rowMeans(m) ## means of previous 3
>> [1]       NA       NA       NA 3.000000 5.000000 6.666667 8.000000
>> 9.000000
>> > rowMeans(m[,1:2]) ## means of previous 2
>> [1]  NA  NA 2.0 4.0 6.0 7.5 8.5 9.5
>>
>>
>> Cheers,
>> Bert
>>
>>
>>
>>
>>
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> On Sun, Mar 25, 2018 at 7:48 AM, Miluji Sb <milujisb at gmail.com> wrote:
>> > Dear all,
>> >
>> > I have weekly data by city (variable citycode). I would like to take the
>> > average of the previous two, three, four weeks (without the current
>> > week)
>> > of the variable called value.
>> >
>> > This is what I have tried to compute the average of the two previous
>> > weeks;
>> >
>> > df = df %>%
>> >   mutate(value.lag1 = lag(value, n = 1)) %>%
>> >   mutate(value .2.previous = rollapply(data = value.lag1,
>> >                                      width = 2,
>> >                                      FUN = mean,
>> >                                      align = "right",
>> >                                      fill = NA,
>> >                                      na.rm = T))
>> >
>> > I crated the lag of the variable first and then attempted to compute the
>> > average but this does not seem to to what I want. What I am doing wrong?
>> > Any help will be appreciated. The data is below. Thank you.
>> >
>> > Sincerely,
>> >
>> > Milu
>> >
>> > dput(droplevels(head(df, 10)))
>> > structure(list(year = c(1970L, 1970L, 1970L, 1970L, 1970L, 1970L,
>> > 1970L, 1970L, 1970L, 1970L), citycode = c(1L, 1L, 1L, 1L, 1L,
>> > 1L, 1L, 1L, 1L, 1L), month = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
>> > 2L, 3L), week = c(1L, 2L, 3L, 4L, 5L, 5L, 6L, 7L, 8L, 9L), date =
>> > structure(c(1L,
>> > 2L, 3L, 4L, 5L, 5L, 6L, 7L, 8L, 9L), .Label = c("1970-01-10",
>> > "1970-01-17", "1970-01-24", "1970-01-31", "1970-02-07", "1970-02-14",
>> > "1970-02-21", "1970-02-28", "1970-03-07"), class = "factor"),
>> >     value = c(-15.035, -20.478, -22.245, -23.576, -8.84099999999995,
>> >     -18.497, -13.892, -18.974, -15.919, -13.576)), .Names = c("year",
>> > "citycode", "month", "week", "date", "tmin"), row.names = c(NA,
>> > 10L), class = "data.frame")
>> >
>> >         [[alternative HTML version deleted]]
>> >
>> > ______________________________________________
>> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>
>



More information about the R-help mailing list