[R] How to delete the replicate rows by summing up the numeric columns

David Winsemius dwinsemius at comcast.net
Tue Jun 29 23:58:48 CEST 2010


On Jun 29, 2010, at 3:05 PM, Yi wrote:

> Hi, folks,
>
> I am sorry that I did not state the problem correctly yesterday.
>
> Let me illustrate the problem with the following code:
>
> first=c('u','b','e','k','j','c','u','f','c','e')
> second=c('usa','Brazil','England','Korea','Japan','China','usa','France',
>          'China','England')
> third=1:10
> data=data.frame(first,second,third)
>
> ## The values in the first column are unique codes for those in the
> ## second column, so 'u' is only for usa. Duplicated rows have the same
> ## values in both the first and second columns.
> ## Now I want to delete the duplicate rows that share a value in the
> ## first (second) column and sum up the values in the third column
> ## within each group.
>
> library(reshape)  ## melt() and cast() come from the reshape package
> mm=melt(data,id='first')
> sum=cast(mm,first~variable,sum) ### This does not work.
>
> ###I tried another way to do this
> mm= melt(data, id='first',measure='third')
> sum=cast(mm,first~variable,sum)
>
> ## But then the problem is how to 'merge' the result with the second
> ## column in the dataset.

 > data$summed <- ave(data$third, data$first, FUN=sum)
# compute sums within the groups defined by "first"
 > data[!duplicated(data$first), c("first", "second", "summed")]
# keep the first row of each group and leave out "third"

   first  second summed
1     u     usa      8
2     b  Brazil      2
3     e England     13
4     k   Korea      4
5     j   Japan      5
6     c   China     15
8     f  France      8
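
For comparison, here is a minimal sketch of another base-R route, using aggregate() rather than ave() plus duplicated(). It is not part of the reply above. It assumes, as the question states, that every code in "first" maps to exactly one value in "second", so grouping on both columns gives one row per code. The name "summed" for the result is only illustrative.

```r
# Rebuild the example data from the question.
first  <- c('u','b','e','k','j','c','u','f','c','e')
second <- c('usa','Brazil','England','Korea','Japan','China','usa','France',
            'China','England')
third  <- 1:10
data   <- data.frame(first, second, third)

# Sum "third" within each (first, second) pair. Because each code belongs
# to exactly one country (an assumption taken from the question), this
# yields one row per code.
summed <- aggregate(third ~ first + second, data = data, FUN = sum)

# aggregate() orders rows by the grouping columns; restore the order in
# which each code first appears in the original data.
summed <- summed[order(match(summed$first, unique(data$first))), ]
rownames(summed) <- NULL
summed
```

Unlike the ave() approach, this keeps the column name "third" for the sums and renumbers the rows, so the row label 8 for France does not carry over.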

>
>
> The expected dataframe is like this:
>
> (I showed a wrong expected dataframe yesterday.)
>
>   first  second third
> 1     u     usa     8
> 2     b  Brazil     2
> 3     e England    13
> 4     k   Korea     4
> 5     j   Japan     5
> 6     c   China    15
> 8     f  France     8
>
> Thanks in advance.

David Winsemius, MD
West Hartford, CT


