[R] Merge partially duplicated rows

David Winsemius dwinsemius at comcast.net
Tue Aug 4 02:02:36 CEST 2009


On Aug 3, 2009, at 7:12 PM, David Winsemius wrote:

>
> On Aug 3, 2009, at 9:24 AM, Rnewbie wrote:
>
>>
>> Dear all,
>>
>> I have a dataset, and I wanted to merge the rows with duplicated IDs by
>> calculating the means or medians from the duplicate rows. I tried using the
>> command duplicated(x), but it only tells where the duplicated rows are.
>
> You might want to look at the ave function. It will calculate a  
> function within IDs and you can assign that as another row in the  
> data frame before you exclude the duplicates.

Correction: I meant to say another column, not another row.

 > tst <- data.frame(ID = sample(c("1234", "4567", "2346"), 10, replace=TRUE),
 +                   val=rnorm(10))
 > tst
      ID         val
1  2346  0.22659389
2  2346  0.46835154
3  2346 -0.53702251
4  2346 -1.00187606
5  1234  0.90843566
6  2346 -0.59654370
7  4567 -0.04355647
8  1234  0.65332120
9  4567 -2.22517105
10 1234 -0.26911187
 > tst$IDmn <- ave(tst$val, tst$ID)  # default function for ave is mean, but others can be used
 > tst
      ID         val       IDmn
1  2346  0.22659389 -0.2880994
2  2346  0.46835154 -0.2880994
3  2346 -0.53702251 -0.2880994
4  2346 -1.00187606 -0.2880994
5  1234  0.90843566  0.4308817
6  2346 -0.59654370 -0.2880994
7  4567 -0.04355647 -1.1343638
8  1234  0.65332120  0.4308817
9  4567 -2.22517105 -1.1343638
10 1234 -0.26911187  0.4308817
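
The transcript above stops after adding the per-ID mean column. The remaining step, excluding the duplicates, might look roughly like the sketch below. It rebuilds tst with a fixed seed so it runs on its own; the seed, the IDmed column, and the names merged and agg are assumptions made for illustration, not part of the original session. It also shows the median, which the original question asked about, by passing FUN to ave.

```r
# Sketch only: rebuild a small example data frame.
# The seed is an assumption so that the run is repeatable.
set.seed(42)
tst <- data.frame(ID = sample(c("1234", "4567", "2346"), 10, replace = TRUE),
                  val = rnorm(10))

# Per-ID mean (the default for ave) and per-ID median (FUN must be named)
tst$IDmn  <- ave(tst$val, tst$ID)
tst$IDmed <- ave(tst$val, tst$ID, FUN = median)

# Keep the first row seen for each ID and drop the raw 'val' column,
# leaving one summary row per ID
merged <- tst[!duplicated(tst$ID), c("ID", "IDmn", "IDmed")]
merged

# Alternative: collapse to one row per ID in a single step
agg <- aggregate(val ~ ID, data = tst, FUN = median)
agg
```

The ave route is convenient when other columns of the first row per ID should be carried along; aggregate is shorter when only the summary is needed.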


David Winsemius, MD
Heritage Laboratories
West Hartford, CT



