[R] median of binned values

Chuck Cleland ccleland at optonline.net
Wed Dec 19 12:47:59 CET 2007


Martin Tomko wrote:
> Thank you, Chuck,
> would you mind commenting a bit on the code? It is not all clear. How 
> would you retrieve only the numeric value (not the category name)?
> I am just starting with R, and the functionality of replicate and levels 
> is not quite clear. I tried the documentation, but am not any wiser. 
> What if I had a vector v <- c(1,10,100,1000,10000) and wanted to 
> perform it on that?
> 
> Thanks a lot
> Martin

  To retrieve the numeric value rather than the category name, index
freq instead of levels(binname):

with(df, freq[median(rep(as.numeric(binname), freq))])
[1] 10000
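
  As a rough step-by-step sketch of what that one-liner does (the
intermediate names codes, expanded and mid are only illustrative, and df
is rebuilt here so the snippet runs on its own):

```r
# Rebuild the example data frame from the earlier message
df <- data.frame(binname = as.factor(paste("cat", 1:5, sep = "")),
                 freq = c(1, 10, 100, 1000, 10000))

# Integer code of each bin: 1, 2, 3, 4, 5
codes <- as.numeric(df$binname)

# Repeat each code as many times as its frequency (11111 values in total)
expanded <- rep(codes, df$freq)

# The median of the expanded codes is the position of the median bin
mid <- median(expanded)

# Use that position to pull out the frequency or the bin name
df$freq[mid]
levels(df$binname)[mid]
```

  Note that median() can return a value ending in .5 when the total count
is even and the two middle observations fall in different bins; indexing
then truncates it to the lower bin.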

  To do essentially the same thing with a vector:

myvec <- c(1,10,100,1000,10000)

myvec[median(rep(seq_along(myvec), myvec))]
[1] 10000
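
  If the counts are large, expanding them with rep() can use a lot of
memory. A possible alternative, which is not what is done above, is to
work from the cumulative counts. Here median_bin is a made-up helper
name, and it assumes non-negative counts with a positive total:

```r
# Hypothetical helper: position of the bin that contains the median,
# found from cumulative counts instead of expanding with rep()
median_bin <- function(freq) {
  stopifnot(all(freq >= 0), sum(freq) > 0)
  # First bin whose cumulative count reaches half of the total
  which(cumsum(freq) >= sum(freq) / 2)[1]
}

myvec <- c(1, 10, 100, 1000, 10000)
myvec[median_bin(myvec)]
```

  When the total is even and the two middle observations fall in
different bins, this sketch returns the lower of the two bins.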

  I'm sure I cannot explain levels() and rep() any better than the help
pages for those functions.
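
  For what it is worth, a small sketch of the two functions on toy data
(the object f is only for illustration):

```r
# rep() with a vector of counts repeats each element that many times
rep(c("a", "b", "c"), c(2, 1, 3))
# "a" "a" "b" "c" "c" "c"

# levels() returns the category labels of a factor, sorted by default
f <- as.factor(c("cat2", "cat1", "cat2"))
levels(f)
# "cat1" "cat2"

# as.numeric() on a factor gives the position of each value in levels(f)
as.numeric(f)
# 2 1 2
```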

> Chuck Cleland wrote:
>> Martin Tomko wrote:
>>> Dear list,
>>> I have a vector (array, table row, whatever is best) of frequency values 
>>> for categories (or bins), and I need to find the median category. 
>>> Trivial to do by hand, but I was wondering if there is a means to do it 
>>> in R in an elegant way.
>>>
>>> The obvious median(vector) returns the median frequency for the bins, 
>>> and that is not what I want, i.e.:
>>>         freq
>>> cat1       1
>>> cat2      10
>>> cat3     100
>>> cat4    1000
>>> cat5   10000
>>>
>>> I want it to return cat5, instead of cat3.
>> df <- data.frame(binname = as.factor(paste("cat", 1:5, sep="")),
>>                  freq = c(1,10,100,1000,10000))
>>
>> df
>>   binname  freq
>> 1    cat1     1
>> 2    cat2    10
>> 3    cat3   100
>> 4    cat4  1000
>> 5    cat5 10000
>>
>> with(df, levels(binname)[median(rep(as.numeric(binname), freq))])
>> [1] "cat5"
>>
>>> Thanks a lot
>>> Martin

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894


