[R] Binning question (binning rows of a data.frame according to a variable)

Adaikalavan Ramasamy ramasamy at cancer.org.uk
Mon Mar 20 19:35:47 CET 2006


[[ Please ignore the last email which was sent incomplete ]]

Let's say there are 10 students in the first group, and denote x1 as (say)
the number of red balls for student 1 and s1 as the total number of balls.
Then I was calculating the average of the proportions,
( x1/s1 + x2/s2 + ... + x10/s10 )/10, and you were calculating the overall
(pooled) proportion, (x1+x2+...+x10)/(s1+s2+...+s10).

It is just by chance that your calculation and mine agree. When the
totals (s1, ..., s10) are highly unbalanced, you may get very different
results.
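
As a rough illustration in R (the counts below are made up and are not from
the original data), the two calculations agree when every student has the
same total and diverge when one student has far more balls than the rest:

```r
# Hypothetical counts for 10 students: x = red balls, s = total balls.

# Balanced case: every student has 10 balls.
x_bal <- c(1, 2, 1, 3, 2, 1, 2, 3, 1, 9)
s_bal <- rep(10, 10)
mean_of_props_bal <- mean(x_bal / s_bal)      # (x1/s1 + ... + x10/s10) / 10
pooled_prop_bal   <- sum(x_bal) / sum(s_bal)  # (x1 + ... + x10) / (s1 + ... + s10)

# Unbalanced case: student 10 has 100 balls, 90 of them red.
x_unb <- c(1, 2, 1, 3, 2, 1, 2, 3, 1, 90)
s_unb <- c(rep(10, 9), 100)
mean_of_props_unb <- mean(x_unb / s_unb)
pooled_prop_unb   <- sum(x_unb) / sum(s_unb)

print(c(balanced_mean = mean_of_props_bal, balanced_pooled = pooled_prop_bal))
print(c(unbalanced_mean = mean_of_props_unb, unbalanced_pooled = pooled_prop_unb))
```

In the unbalanced case the pooled proportion is pulled towards the student
with the most balls, while the simple mean treats every student equally.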



On second thoughts, I think it is much better to calculate a weighted
average of the proportions. The weights should reflect the variance of
the estimate of each proportion (the more precise the estimate, the
larger the weight). Assuming that your outcome of interest is a
proportion, the summary effect size might look something like

  p_hat = ( w1*p1 + w2*p2 + ... + w10*p10 ) / ( w1 + w2 + ... + w10 )

  where p1 = x1/s1 and w1 = 1/var(p1), and similarly for the others.

You should be able to obtain the standard error for this estimate.
Using this, you can build a confidence interval and see if it overlaps
with the intervals for the proportion of reds in the other groups.
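
A minimal sketch of such a weighted estimate in R follows. It rests on
several assumptions of mine: the counts are made up, var(p_i) is estimated
by the binomial formula p_i*(1 - p_i)/s_i, the weighted sum is divided by
the sum of the weights, the standard error is taken as sqrt(1/sum(w)) as in
a fixed-effect inverse-variance analysis, and the interval uses a normal
approximation. Note that a student with p_i equal to 0 or 1 would get a
zero variance and hence an infinite weight, so such cases need separate
handling.

```r
# Hypothetical counts for 10 students: x = red balls, s = total balls.
x <- c(3, 5, 2, 8, 4, 6, 3, 7, 5, 40)
s <- c(10, 12, 8, 20, 10, 15, 9, 18, 11, 100)

p <- x / s              # per-student proportions p_i = x_i / s_i
v <- p * (1 - p) / s    # assumed binomial variance estimate of each p_i
w <- 1 / v              # inverse-variance weights w_i = 1 / var(p_i)

# Weighted average of the proportions, normalised by the sum of the weights.
p_hat <- sum(w * p) / sum(w)

# Standard error under the fixed-effect (common proportion) assumption.
se <- sqrt(1 / sum(w))

# Approximate 95% confidence interval (normal approximation).
ci <- p_hat + c(-1, 1) * qnorm(0.975) * se

print(c(p_hat = p_hat, se = se, lower = ci[1], upper = ci[2]))
```

Repeating this for each group gives one interval per group, which can then
be compared for overlap.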



There is a big field called meta-analysis that deals with this kind of
issue. You might want to read up more about this area. However, I am not
too familiar with the meta-analysis of proportions.

Perhaps someone on the mailing list can advise you on whether this approach
is appropriate for your situation, and perhaps even suggest some references.


Regards, Adai

<SNIP>



