[R] median of binned values

Moshe Olshansky m_olshansky at yahoo.com
Wed Dec 19 23:22:19 CET 2007


Alternatively, take the first bin whose cumulative frequency reaches
half of the total:
levels(df$binname)[which(cumsum(df$freq) >= 0.5*sum(df$freq))[1]]
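
The cumulative-frequency idea can also be wrapped in a small function.
What follows is only a minimal sketch, not something posted in the thread:
median_bin is a hypothetical name, and it assumes the rows are already in
bin order and that ties (a cumulative count exactly equal to half the
total) should resolve to the lower bin.

```r
# Hypothetical helper: label of the bin that contains the median observation.
# Assumes `bins` and `freq` are parallel vectors listed in bin order.
median_bin <- function(bins, freq) {
  stopifnot(length(bins) == length(freq), all(freq >= 0), sum(freq) > 0)
  cum <- cumsum(freq)                    # running totals in bin order
  idx <- which(cum >= sum(freq) / 2)[1]  # first bin reaching half the total
  as.character(bins)[idx]                # labels taken in row order
}

df <- data.frame(binname = factor(paste0("cat", 1:5)),
                 freq = c(1, 10, 100, 1000, 10000))
median_bin(df$binname, df$freq)           # "cat5"
median_bin(c("a", "b", "c"), c(3, 3, 3))  # "b"
```

Unlike the rep() approach, this never expands the data to one element per
observation, which may matter when the counts are large. Taking the labels
in row order rather than from levels() also sidesteps the alphabetical
ordering of factor levels (cat10 sorting before cat2).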

--- Chuck Cleland <ccleland at optonline.net> wrote:

> Martin Tomko wrote:
> > Dear list,
> > I have a vector (array, table row, whatever is best) of frequency values
> > for categories (or bins), and I need to find the median category.
> > Trivial to do by hand, but I was wondering if there is a means to do it
> > in R in an elegant way.
> > 
> > The obvious median(vector) returns the median frequency for the bins,
> > and that is not what I want, i.e.:
> >        freq
> > cat1      1
> > cat2     10
> > cat3    100
> > cat4   1000
> > cat5  10000
> > 
> > I want it to return cat5, instead of cat3.
> 
> df <- data.frame(binname = as.factor(paste("cat", 1:5, sep="")),
>                  freq = c(1,10,100,1000,10000))
> 
> df
>   binname  freq
> 1    cat1     1
> 2    cat2    10
> 3    cat3   100
> 4    cat4  1000
> 5    cat5 10000
> 
> with(df,
>      levels(binname)[median(rep(as.numeric(binname), freq))])
> [1] "cat5"
> 
> > Thanks a lot
> > Martin
> > 
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
> -- 
> Chuck Cleland, Ph.D.
> NDRI, Inc.
> 71 West 23rd Street, 8th floor
> New York, NY 10010
> tel: (212) 845-4495 (Tu, Th)
> tel: (732) 512-0171 (M, W, F)
> fax: (917) 438-0894
> 


