[R] Calculate median from counts and values

Gabor Grothendieck ggrothendieck at gmail.com
Wed May 4 03:23:14 CEST 2005


On 5/3/05, David Finlayson <david.p.finlayson at gmail.com> wrote:
> I am tangled with a syntax question. I want to calculate basic statistics
> for a large dataset provided in weights and values and I can't figure out
> an elegant way to expand the data.
> 
> For example here are the counts:
> 
> > counts
>    n4 n3 n2 n1 p0 p1 p2 p3  p4
> 1   0  0  0  1  1  3 16 55  24
> 2   0  0  0  0  2  8 28 47  15
> 3   1 17 17 13  4  5 12 24   8
> ...
> 
> and the values:
> 
> > values
>      n4 n3 n2 n1 p0  p1   p2    p3     p4
> [1,] 16  8  4  2  1 0.5 0.25 0.125 0.0625
> 
> What I want for each row is something like this (shown for row 1):
> 
> c( rep(16, 0), rep(8, 0), rep(4, 0), rep(2, 1), rep(1, 1), rep(0.5, 3),
> rep(0.25, 16), rep(0.125, 55), rep(0.0625, 24))
> 
> I am sure that this is a one-liner for an R-master, but I can't figure it
> out without a set of nested for loops iterating over each row in counts.
> 

Is there supposed to be one row of values that apply to all
rows of counts or is there to be different rows of values for
different rows of counts?  Also in your example row 3 has
a different total than 1 or 2.  Is that right?

At any rate, I will assume that there is only one row of
values and many rows of counts, and that it's not necessarily
true that counts sum to the same number in each row.
Then, noting that c(rep(4,1), rep(5,2), rep(6,3)) is the same
as rep(4:6, 1:3), we have:

lapply(as.data.frame(t(counts)), rep, x = unlist(values))
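
For what it's worth, here is a rough sketch of how that one-liner
could be carried through to the per-row medians mentioned in the
subject line. The counts and values objects are rebuilt by hand from
the three rows shown in the question (I am assuming counts is a data
frame and values a one-row matrix, as the printouts suggest), and the
names expanded and medians are just illustrative.

```r
# Example data retyped from the question (first three rows only).
# Assumption: counts is a data frame, values is a one-row matrix.
counts <- data.frame(
  n4 = c(0, 0, 1),    n3 = c(0, 0, 17),   n2 = c(0, 0, 17),
  n1 = c(1, 0, 13),   p0 = c(1, 2, 4),    p1 = c(3, 8, 5),
  p2 = c(16, 28, 12), p3 = c(55, 47, 24), p4 = c(24, 15, 8)
)
values <- matrix(c(16, 8, 4, 2, 1, 0.5, 0.25, 0.125, 0.0625),
                 nrow = 1, dimnames = list(NULL, names(counts)))

# t(counts) turns each row of counts into a column, so lapply() visits
# one original row at a time and passes it to rep() as 'times'.
expanded <- lapply(as.data.frame(t(counts)), rep, x = unlist(values))

# Each list element is now the fully expanded sample for one row,
# so any ordinary summary function can be applied to it.
medians <- sapply(expanded, median)
```

The same sapply() call should work with mean, quantile and so on in
place of median. Note that this expands every row in full, so for very
large counts it may use a lot of memory.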



