[R] Subsetting a data frame by a factor, using the level that occurs the most times

Douglas Bates bates at stat.wisc.edu
Thu Jan 20 18:16:50 CET 2005


Liaw, Andy wrote:
>>From: Douglas Bates
>>
>>michael watson (IAH-C) wrote:
>>
>>>I think that title makes sense... I hope it does...
>>>
>>>I have a data frame, one of the columns of which is a factor.  I want
>>>the rows of data that correspond to the level in that factor which
>>>occurs the most times.  
>>
>>So first you want to determine the mode (in the sense of the most 
>>frequently occurring value) of the factor.   One way to do this is
>>
>>names(which.max(table(fac)))
>>
>>Use this comparison for the subset as
>>
>>subset(data, pattern == names(which.max(table(pattern))))
> 
> 
> Just be careful that if there are ties (i.e., more than one level having the
> max) which.max() will randomly pick one of them.  That may or may not be
> what's desired.  If that is a possibility, Mick will need to think what he
> wants in such cases.

According to the documentation it picks the first one.  Also, that's 
what Martin Maechler told me and he wrote the code so I trust him on 
that.  I figure that if you have to trust someone to be meticulous and 
precise then a German-speaking Swiss is a good choice.
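
The tie behaviour is easy to check on a small example.  The following is
only a sketch: the data frame `dat` and its `value` column are made up
for illustration (the column name `pattern` follows the thread), and the
`%in%` variant is just one possible way to keep every tied level if that
is what is wanted.

```r
# Hypothetical toy data: levels "b" and "c" are tied for the most rows.
dat <- data.frame(
  pattern = factor(c("a", "b", "b", "c", "c")),
  value   = 1:5
)

# Frequency of each level (a = 1, b = 2, c = 2).
counts <- table(dat$pattern)

# which.max() returns the position of the first maximum,
# so the earlier level wins the tie.
top_first <- names(which.max(counts))
first_only <- subset(dat, pattern == top_first)

# Alternative (an assumption about what might be wanted):
# keep the rows for every level tied for the maximum count.
top_all <- names(counts)[counts == max(counts)]
all_tied <- subset(dat, pattern %in% top_all)
```

Here `first_only` holds only the rows for "b", while `all_tied` holds the
rows for both "b" and "c".  Which of the two is appropriate depends on
what should happen in the tied case.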



