[R] summing and combining rows

arun smartpink111 at yahoo.com
Wed Aug 8 20:20:55 CEST 2012



Hi,

From ?aggregate:
formula: a formula, such as ‘y ~ x’ or ‘cbind(y1, y2) ~ x1 + x2’,
          where the ‘y’ variables are numeric data to be split into
          groups according to the grouping ‘x’ variables (usually
          factors).
So I converted the grouping variables in your data to factors; the results are the same.
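As a small illustration of that point, here is a minimal sketch on toy data (the data frame `toy` is hypothetical, modeled on the short example earlier in this thread). It sums Stems once with numeric grouping variables and once with the same variables converted to factors; the names `toy`, `toy_f`, `res_raw` and `res_fac` are only for this sketch.

```r
# Toy data (hypothetical, modeled on the small example from this thread)
toy <- data.frame(
  Plot      = c(12, 12, 17, 17),
  Elevation = c(1200, 1200, 2320, 2320),
  SizeClass = c("Class3", "Class4", "Class3", "Class4"),
  Stems     = c(0, 1, 3, 5),
  stringsAsFactors = FALSE
)

# Grouping variables left as numeric
res_raw <- aggregate(Stems ~ Plot + Elevation, data = toy, FUN = sum)

# Same grouping variables converted to factors first
toy_f <- transform(toy, Plot = factor(Plot), Elevation = factor(Elevation))
res_fac <- aggregate(Stems ~ Plot + Elevation, data = toy_f, FUN = sum)

# The sums agree; only the classes of the grouping columns differ
res_raw$Stems
res_fac$Stems
```

Either way aggregate() forms one group per combination of the grouping variables that occurs in the data, so converting to factors should not change the sums.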

# Convert each column of a data frame to the type named in 'types'
# (one entry per column: "character", "numeric" or "factor").
convert.type1 <- function(obj,types){
    for (i in seq_along(obj)){
        FUN <- switch(types[i],character = as.character,
                                   numeric = as.numeric,
                                   factor = as.factor)
        obj[,i] <- FUN(obj[,i])
    }
    obj
}
# Columns 1-8 and 10-11 become factors; column 9 (Stems) stays numeric
dat2<-convert.type1(dat1,c(rep("factor",8),"numeric","factor","factor"))
str(dat2)
'data.frame':    8 obs. of  11 variables:
 $ Data     : Factor w/ 1 level "VTM": 1 1 1 1 1 1 1 1
 $ Plot     : Factor w/ 4 levels "39C16","39F11",..: 1 1 2 2 3 3 4 4
 $ Lat      : Factor w/ 4 levels "39.54522","39.56214",..: 4 4 3 3 2 2 1 1
 $ LatCat   : Factor w/ 1 level "Lat6": 1 1 1 1 1 1 1 1
 $ Elevation: Factor w/ 3 levels "500","900","1500": 3 3 1 1 3 3 2 2
 $ ElevCat  : Factor w/ 1 level "Elev1": 1 1 1 1 1 1 1 1
 $ Type     : Factor w/ 1 level "Conifer": 1 1 1 1 1 1 1 1
 $ SizeClass: Factor w/ 2 levels "Class3","Class4": 1 2 1 2 1 2 1 2
 $ Stems    : num  0 1 0 0 3 1 1 2
 $ Area     : Factor w/ 3 levels "694.0784","751.5347",..: 2 2 2 2 1 1 3 3
 $ Density  : Factor w/ 3 levels "0","13.08926",..: 1 3 1 1 1 1 2 1
# Leaving Density out of the formula groups the rows by the combinations of the remaining factors
aggregate(Stems~Plot+Data+Lat+LatCat+Elevation+Type+Area,data=dat2,sum)
   Plot Data      Lat LatCat Elevation    Type     Area Stems
1 39F13  VTM 39.56214   Lat6      1500 Conifer 694.0784     4
2 39F11  VTM 39.57721   Lat6       500 Conifer 751.5347     0
3 39C16  VTM 39.76282   Lat6      1500 Conifer 751.5347     1
4 39F14  VTM 39.54522   Lat6       900 Conifer  763.985     3
# It won't collapse further than this, because Plot and Lat each have four levels, unless you drop those too:

aggregate(Stems~Data+LatCat+Elevation+Type,data=dat2,sum)
  Data LatCat Elevation    Type Stems
1  VTM   Lat6       500 Conifer     0
2  VTM   Lat6       900 Conifer     3
3  VTM   Lat6      1500 Conifer     5
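
For the original question (combine Class3 and Class4 into one class while leaving other size classes alone), one possible approach is to relabel the two classes first and then keep SizeClass in the grouping. This is only a sketch under assumptions: the data frame `dat` below is made up, and "Class2" is a hypothetical extra size class added to show that other classes are left untouched.

```r
# Hypothetical data: the thread's small example plus an extra size class
dat <- data.frame(
  Plot      = c(12, 12, 12, 17, 17, 17),
  Elevation = c(1200, 1200, 1200, 2320, 2320, 2320),
  Area      = c(132.4, 132.4, 132.4, 209.1, 209.1, 209.1),
  SizeClass = c("Class2", "Class3", "Class4", "Class2", "Class3", "Class4"),
  Stems     = c(2, 0, 1, 4, 3, 5),
  stringsAsFactors = FALSE
)

# Relabel Class3 and Class4 as one combined class; other classes are untouched
dat$SizeClass[dat$SizeClass %in% c("Class3", "Class4")] <- "Class34"

# Sum Stems within each Plot/Elevation/Area/SizeClass combination
combined <- aggregate(Stems ~ Plot + Elevation + Area + SizeClass,
                      data = dat, FUN = sum)
combined
```

Any column that differs between the rows being combined (Density in your data) has to stay out of the formula, otherwise the rows will not collapse.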

A.K.






----- Original Message -----
From: Christopher R. Dolanc <crdolanc at ucdavis.edu>
To: arun <smartpink111 at yahoo.com>
Cc: 
Sent: Wednesday, August 8, 2012 2:00 PM
Subject: Re: [R] summing and combining rows

OK. I can make this work. Thank you for helping me figure this out.

On 8/8/2012 10:49 AM, arun wrote:
> Hello,
>
> I tried with ddply() from the plyr package:
>
> library(plyr)
> ddply(dat1,.(Data,Plot,Lat,LatCat,Elevation,Type,Area,Density),summarize,sum(Stems))
>    Data  Plot      Lat LatCat Elevation    Type     Area  Density ..1
> 1  VTM 39C16 39.76282   Lat6      1500 Conifer 751.5347  0.00000   0
> 2  VTM 39C16 39.76282   Lat6      1500 Conifer 751.5347 13.30611   1
> 3  VTM 39F11 39.57721   Lat6       500 Conifer 751.5347  0.00000   0
> 4  VTM 39F13 39.56214   Lat6      1500 Conifer 694.0784  0.00000   4
> 5  VTM 39F14 39.54522   Lat6       900 Conifer 763.9850  0.00000   2
> 6  VTM 39F14 39.54522   Lat6       900 Conifer 763.9850 13.08926   1
>
>
> The results look the same as with aggregate().
> If you take out Density:
>
> ddply(dat1,.(Data,Plot,Lat,LatCat,Elevation,Type,Area),summarize,sum(Stems))
>    Data  Plot      Lat LatCat Elevation    Type     Area ..1
> 1  VTM 39C16 39.76282   Lat6      1500 Conifer 751.5347   1
> 2  VTM 39F11 39.57721   Lat6       500 Conifer 751.5347   0
> 3  VTM 39F13 39.56214   Lat6      1500 Conifer 694.0784   4
> 4  VTM 39F14 39.54522   Lat6       900 Conifer 763.9850   3
>
> Now the two size classes are summed within each plot.
>
>
>
> A.K.
>
>
>
>
>
>
> ----- Original Message -----
> From: Christopher R. Dolanc <crdolanc at ucdavis.edu>
> To: arun <smartpink111 at yahoo.com>
> Cc:
> Sent: Wednesday, August 8, 2012 1:19 PM
> Subject: Re: [R] summing and combining rows
>
> ok, so it looks like aggregate lists them separately unless everything
> in the 2 rows matches. Below, we have 2 plots where the density is
> different in Class3 than Class4, and these are not summed. Is that your
> understanding?
>
> Thanks for your help.
>
> Chris
>
> On 8/7/2012 4:18 PM, arun wrote:
>> Hi,
>>
>> I tried aggregate() two ways; the results are the same.
>> dat1<-read.table(text="
>>       Data           Plot      Lat LatCat Elevation ElevCat    Type SizeClass Stems     Area   Density
>>       VTM          39C16 39.76282   Lat6      1500   Elev1 Conifer    Class3     0 751.5347   0.00000
>>       VTM          39C16 39.76282   Lat6      1500   Elev1 Conifer    Class4     1 751.5347  13.30611
>>       VTM          39F11 39.57721   Lat6       500   Elev1 Conifer    Class3     0 751.5347   0.00000
>>       VTM          39F11 39.57721   Lat6       500   Elev1 Conifer    Class4     0 751.5347   0.00000
>>       VTM          39F13 39.56214   Lat6      1500   Elev1 Conifer    Class3     3 694.0784   0.00000
>>       VTM          39F13 39.56214   Lat6      1500   Elev1 Conifer    Class4     1 694.0784   0.00000
>>       VTM          39F14 39.54522   Lat6       900   Elev1 Conifer    Class3     1 763.9850  13.08926
>>       VTM          39F14 39.54522   Lat6       900   Elev1 Conifer    Class4     2 763.9850   0.00000
>> ",sep="",header=TRUE, stringsAsFactors=FALSE)
>>
>>
>>> with(dat1,aggregate(Stems,list(Plot,Data,Lat,LatCat,Elevation,Type,Area,Density),sum))
>>      Group.1 Group.2  Group.3 Group.4 Group.5 Group.6  Group.7  Group.8 x
>> 1   39F13     VTM 39.56214    Lat6    1500 Conifer 694.0784  0.00000 4
>> 2   39F11     VTM 39.57721    Lat6     500 Conifer 751.5347  0.00000 0
>> 3   39C16     VTM 39.76282    Lat6    1500 Conifer 751.5347  0.00000 0
>> 4   39F14     VTM 39.54522    Lat6     900 Conifer 763.9850  0.00000 2
>> 5   39F14     VTM 39.54522    Lat6     900 Conifer 763.9850 13.08926 1
>> 6   39C16     VTM 39.76282    Lat6    1500 Conifer 751.5347 13.30611 1
>>> aggregate(Stems~Plot+Data+Lat+LatCat+Elevation+Type+Area+Density,data=dat1,sum)
>>       Plot Data      Lat LatCat Elevation    Type     Area  Density Stems
>> 1 39F13  VTM 39.56214   Lat6      1500 Conifer 694.0784  0.00000     4
>> 2 39F11  VTM 39.57721   Lat6       500 Conifer 751.5347  0.00000     0
>> 3 39C16  VTM 39.76282   Lat6      1500 Conifer 751.5347  0.00000     0
>> 4 39F14  VTM 39.54522   Lat6       900 Conifer 763.9850  0.00000     2
>> 5 39F14  VTM 39.54522   Lat6       900 Conifer 763.9850 13.08926     1
>> 6 39C16  VTM 39.76282   Lat6      1500 Conifer 751.5347 13.30611     1
>>
>>
>>
>> For the plots at Lat 39.57721 and 39.56214, the Class3 and Class4 rows differ only in Stems (SizeClass is not in the grouping), so they were summed. For the other two plots, Density differs between Class3 and Class4, so those rows are reported separately.
>>
>> A.K.
>>
>>
>>
>>
>>
>>
>>
>>
>> ----- Original Message -----
>> From: Christopher R. Dolanc <crdolanc at ucdavis.edu>
>> To: arun <smartpink111 at yahoo.com>
>> Cc:
>> Sent: Tuesday, August 7, 2012 6:38 PM
>> Subject: Re: [R] summing and combining rows
>>
>> Hmmm. It looks like it's only giving me the values for Class3, instead
>> of summing, which is why I thought the "+" method might not be the
>> appropriate coding.
>>
>> Here's the code I used:
>>
>>> CH_Con_Elev1SC34a<-
>> aggregate(Stems~Plot+Data+Lat+LatCat+Elevation+Type+Area+Density,
>> data=CH_Con_Elev1SC34, sum)
>>> CH_Con_Elev1SC34b<- data.frame(CH_Con_Elev1SC34a,
>> SizeClass=rep("Class34",))
>>
>> If it helps, attached is a txt file with the data structure.
>>
>> On 8/7/2012 3:00 PM, arun wrote:
>>> Hi,
>>> I'm not sure why "+" doesn't work for you; it works on your example:
>>> dat1<-read.table(text="
>>> Plot        Elevation        Area        SizeClass    Stems
>>> 12            1200            132.4        Class3            0
>>> 12            1200            132.4        Class4            1
>>> 17            2320            209.1        Class3            3
>>> 17            2320            209.1        Class4            5
>>> ",sep="",header=TRUE,stringsAsFactors=FALSE)
>>>
>>> dat2<-aggregate(Stems~Plot+Elevation+Area, data=dat1,sum)
>>> dat3<-data.frame(dat2,SizeClass=rep("Class34",2))
>>> dat3<-dat3[,c(1:3,5,4)]  # reorder columns: SizeClass before Stems
>>> dat3
>>> #  Plot Elevation  Area SizeClass Stems
>>> #1   12      1200 132.4   Class34     1
>>> #2   17      2320 209.1   Class34     8
>>>
>>> A.K.
>>>
>>>
>>>
>>>
>>>
>>>
>>> ----- Original Message -----
>>> From: Christopher R. Dolanc <crdolanc at ucdavis.edu>
>>> To: arun <smartpink111 at yahoo.com>
>>> Cc:
>>> Sent: Tuesday, August 7, 2012 5:47 PM
>>> Subject: Re: [R] summing and combining rows
>>>
>>> Thanks for your response. The aggregate method mostly works for me, but
>>> I have numerous other columns that I'd like to keep in the result. So,
>>> if I have something like this:
>>>
>>>
>>> Plot        Elevation        Area        SizeClass    Stems
>>> 12            1200            132.4        Class3            0
>>> 12            1200            132.4        Class4            1
>>> 17            2320            209.1        Class3            3
>>> 17            2320            209.1        Class4            5
>>>
>>> How can I make it look like this?
>>>
>>> Plot        Elevation        Area        SizeClass    Stems
>>> 12            1200            132.4        Class34         1
>>> 17            2320            209.1        Class34         8
>>>
>>> I see something in ?aggregate about adding columns with a +, but this
>>> doesn't quite work for me.
>>>
>>>
>>> On 8/7/2012 2:32 PM, arun wrote:
>>>> Hi,
>>>>
>>>> Try this:
>>>> dat1<-read.table(text="
>>>> Plot    SizeClass    Stems
>>>> 12      Class3            1
>>>> 12      Class4            3
>>>> 17      Class3            5
>>>> 17      Class4            2
>>>> ",sep="",header=TRUE, stringsAsFactors=FALSE)
>>>>
>>>>
>>>>
>>>> library(plyr)  # provides ddply()
>>>> ddply(dat1,.(Plot), summarize, sum(Stems))
>>>>
>>>> #or
>>>>
>>>>
>>>> dat2<-aggregate(Stems~Plot,data=dat1,sum)
>>>> dat3<-data.frame(dat2,SizeClass=rep("Class34",2))
>>>> dat3
>>>> #  Plot Stems SizeClass
>>>> #1   12     4   Class34
>>>> #2   17     7   Class34
>>>>
>>>>
>>>> A.K.
>>>>
>>>> ----- Original Message -----
>>>> From: Christopher R. Dolanc <crdolanc at ucdavis.edu>
>>>> To: r-help at r-project.org
>>>> Cc:
>>>> Sent: Tuesday, August 7, 2012 1:47 PM
>>>> Subject: [R] summing and combining rows
>>>>
>>>> Hello,
>>>>
>>>> I have a data set that needs to be combined so that rows are summed by a group based on a certain variable. I'm pretty sure rowsum() or rowSums() can do this but it's difficult for me to figure out how it will work for my data based on the examples I've read.
>>>>
>>>> My data are structured like this:
>>>>
>>>> Plot    SizeClass    Stems
>>>> 12       Class3            1
>>>> 12       Class4            3
>>>> 17       Class3            5
>>>> 17       Class4            2
>>>>
>>>> I simply want to sum the size classes by plot and create a new data frame with a size class called "Class34" or with the SizeClass variable removed. I actually do have other size classes that I want to leave alone, but combine 3 and 4, so if I could figure out how to do this by creating a new class, that would be preferable.
>>>>
>>>> I've also attached a more detailed sample of data.
>>>>
>>>> Thanks,
>>>> Chris Dolanc
>>>>
>>>> -- Christopher R. Dolanc
>>>> Post-doctoral Researcher
>>>> University of Montana and UC-Davis
>>>>

-- 
Christopher R. Dolanc
Post-doctoral Researcher
University of Montana and UC-Davis


