[R] How to get the most frequent value of the subgroup

David Winsemius dwinsemius at comcast.net
Fri Mar 30 17:39:43 CEST 2012


On Mar 30, 2012, at 3:38 AM, Milan Bouchet-Valat wrote:

> On Thursday, 29 March 2012 at 09:49 -0500, Yongsuhk Jung wrote:
>> Dear Members of the R-Help,
>>
>>
>>
>> While using the R function 'aggregate' that you developed, I have come
>> across a question.
>>
>> In that function,
>>
>>
>>
>>> aggregate(x, by, FUN, ..., simplify = TRUE)
>>
>>
>>
>> I was wondering what kind of FUN I should write if I want to get "the
>> most frequent value of the subgroup" as a summary statistic for each
>> subgroup.
>>
>> I would appreciate your thoughts on this issue.
> It would have been better if you had provided sample data, as requested
> by the posting guide.

How TRUE.

>
> Anyway, here's a possibility:
>> df <- data.frame(a=rep(1:3, 2), b=c(1, 2, 2, 1, 1, 2))
>> df
>  a b
> 1 1 1
> 2 2 2
> 3 3 2
> 4 1 1
> 5 2 1
> 6 3 2
>> aggregate(df$a, list(df$b), function(x) max(table(x)))
>  Group.1 x
> 1       1 2
> 2       2 2

Prompted by the obvious error in that solution (max(table(x)) returns
the count of the most frequent value, not the value itself; the mode of
a is 1 where b==1 and 3 where b==2), I thought I would take my untested
code strategy and fix it as well, now that an example was "on the
table" for discussion:

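 > aggregate(df[1], by=df[2], FUN=function(x) {
       tbl <- table(x)   # counts of each distinct value in the subgroup
       # the label of the largest count is the modal value (a character string)
       return( dimnames(tbl)[[1]][ which.max(tbl) ] )
   } )
  b a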
1 1 1
2 2 3

(The modal values are in the "a" column.)
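
For comparison, here is a minimal sketch of a reusable helper that keeps
the original type of the values rather than returning the character
labels from the table. The name most_frequent is hypothetical (it is not
a base R function), and the tie-breaking rule (first value to appear
wins) is an assumption, not something specified in this thread.

```r
# Example data from the thread
df <- data.frame(a = rep(1:3, 2), b = c(1, 2, 2, 1, 1, 2))

# most_frequent() is a hypothetical helper, not part of base R.
# It returns the most frequent value of x, keeping x's original type.
# Assumption: ties go to the value that appears first in x.
most_frequent <- function(x) {
  ux <- unique(x)                    # distinct values, in order of first appearance
  counts <- tabulate(match(x, ux))   # how often each distinct value occurs
  ux[which.max(counts)]              # which.max() returns the first maximum
}

# The count of the modal value versus the modal value itself
max(table(c(2, 3, 3)))      # the count of the most frequent value
most_frequent(c(2, 3, 3))   # the most frequent value

# Formula interface: summarise a within each level of b
res <- aggregate(a ~ b, data = df, FUN = most_frequent)
res
```

The first comparison shows why max(table(x)) cannot work here: it
reports how often the mode occurs, not which value it is. With the
example data, res should have one row per level of b, with the modal
value of a for that group in the "a" column.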


-- 

David Winsemius, MD
West Hartford, CT


