[R] Subsetting a data frame by a factor, using the level that occurs the most times

Liaw, Andy andy_liaw at merck.com
Thu Jan 20 17:12:29 CET 2005


> From: Douglas Bates
> 
> michael watson (IAH-C) wrote:
> > I think that title makes sense... I hope it does...
> > 
> > I have a data frame, one of the columns of which is a factor.  I want
> > the rows of data that correspond to the level in that factor which
> > occurs the most times.  
> 
> So first you want to determine the mode (in the sense of the most 
> frequently occurring value) of the factor.  One way to do this is
> 
> names(which.max(table(fac)))
> 
> Use this comparison for the subset as
> 
> subset(data, pattern == names(which.max(table(pattern))))
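
For concreteness, here is a minimal sketch of the suggestion above on a made-up
data frame.  The names `dat`, `value`, `top` and `res` are only for
illustration; `pattern` stands in for the factor column in the original
question.

```r
# Hypothetical data: 'pattern' plays the role of the factor column
dat <- data.frame(
  pattern = factor(c("a", "b", "b", "c", "b", "a")),
  value   = 1:6
)

# Name of the level that occurs most often in the factor
top <- names(which.max(table(dat$pattern)))

# Keep only the rows belonging to that level
res <- subset(dat, pattern == top)
```

With these data `top` is "b" and `res` holds the three rows where pattern is
"b".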

Just be careful that if there are ties (i.e., more than one level having the
max), which.max() returns only the first of them, in the order of the factor
levels.  That may or may not be what's desired.  If that is a possibility,
Mick will need to think about what he wants in such cases.
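
A small sketch of the tie case, on made-up data (the names `fac`, `dat`,
`first_max`, `all_max` and `tied_rows` are hypothetical).  As documented,
which.max() gives the index of the first maximum, so only one of the tied
levels comes back; comparing the counts against max() is one possible way to
keep all of them, assuming that is what is wanted.

```r
# Hypothetical factor in which "x" and "y" both occur twice
fac <- factor(c("x", "y", "y", "x", "z"))
dat <- data.frame(fac = fac, value = 1:5)

tab <- table(fac)

# which.max() reports only the first maximum, in factor-level order
first_max <- names(which.max(tab))

# All levels whose count equals the maximum count
all_max <- names(tab)[as.vector(tab) == max(tab)]

# Rows belonging to any of the tied levels
tied_rows <- subset(dat, fac %in% all_max)
```

Here `first_max` is "x" alone, while `all_max` contains both "x" and "y", and
`tied_rows` keeps the four rows for those two levels.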

Andy

 