[R] cumulative sum of within levels of a dataframe
Gavin Simpson
gavin.simpson at ucl.ac.uk
Fri Jun 27 23:09:35 CEST 2008
On Fri, 2008-06-27 at 16:52 -0400, Levi Waldron wrote:
> This one should be easy but it's giving me a hard time mostly because tapply
> puts the results in a list. I want to calculate the cumulative sum of a
> variable in a dataframe, but with the accumulation only within each level of
> a factor. For a very simple example, take:
> df$willdo <- unlist(tapply(df$x, df$fac, cumsum))
> df$ideal <- df$willdo - df$x
> df
   x fac willdo ideal
1  1   a      1     0
2  1   a      2     1
3  1   a      3     2
4  1   a      4     3
5  1   a      5     4
6  2   b      2     0
7  2   b      4     2
8  2   b      6     4
9  2   b      8     6
10 2   b     10     8
11 3   c      3     0
12 3   c      6     3
13 3   c      9     6
14 3   c     12     9
15 3   c     15    12
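
One caveat: unlist(tapply(...)) returns its values grouped by factor level, so it lines up with the rows here only because df is already sorted by fac. If the rows might not be grouped by level, a minimal sketch using base R's ave() avoids relying on that ordering. The shuffled data frame and the seed below are illustrative assumptions, not part of the original example.

```r
# Same example data as in the original post
df <- data.frame(x = c(rep(1, 5), rep(2, 5), rep(3, 5)),
                 fac = gl(3, 5, labels = letters[1:3]))

# Shuffle the rows so the factor levels are no longer contiguous
# (hypothetical scenario; the seed is arbitrary, for reproducibility only)
set.seed(1)
shuffled <- df[sample(nrow(df)), ]

# ave() applies FUN within each level of fac and puts each result
# back in the row it came from, so row order does not matter
shuffled$willdo <- ave(shuffled$x, shuffled$fac, FUN = cumsum)
shuffled$ideal  <- shuffled$willdo - shuffled$x
```

On the sorted df, ave(df$x, df$fac, FUN = cumsum) gives the same values as the unlist(tapply(...)) line above.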
HTH
G
>
> > df <- data.frame(x=c(rep(1,5),rep(2,5),rep(3,5)),fac=gl(3,5,labels=letters[1:3]))
> > df
>
> I'd like to create another column in the dataframe so it looks like this,
> and make sure that the cumulative sums still match the right levels of the
> factor. I've included a "willdo" column that's just a cumulative sum, and
> an "ideal" column that's the cumulative sum minus the current value - the
> column headings are self explanatory.
>
> > answer
>
>
