[R] replicating the odds ratio from a published study

Bob Green bgreen at dyson.brisnet.org.au
Sun Jan 28 22:13:46 CET 2007


Thanks. Yes, clearly the volume number for the Schanda paper I cited is wrong.

What is perplexing is that I used the method Peter suggested on two papers 
by Eronen (referenced below). For the first paper, R gives an odds ratio 
similar to the published one, OR = 9.7 (CI = 7.4-12.6), whereas for the 
second paper (Eronen 2) my results are quite different from the published 
OR = 10.0 (8.1-12.5). One reason I wanted to work through the calculations 
was so that I could analyse data from studies using the same method, for 
confirmation.

A further issue is that Woodward, who is also the author of an 
epidemiological text, says in a review that Eronen used the wrong formula 
in a 1995 paper, and indicates that this comment also applies to later 
studies. He states that "they use methods designed for use with binomial 
data when they really have Poisson data. Consequently, they quote odds 
ratios when they really have relative rates and their confidence intervals 
are inaccurate". Eronen1 cites the formula that was used for the OR. 
Schanda sets out his table for the odds ratio in the same way as Eronen1.
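
To make Woodward's distinction concrete, here is a rough sketch of a rate 
ratio computed from the same Eronen 1 counts used in the fisher.test call 
further down. It assumes, purely for illustration, that the population 
counts can stand in as denominators for the rates and that the homicide 
counts are Poisson. Nothing here is confirmed as the calculation Woodward 
or Eronen had in mind, and all variable names are my own.

```r
# Counts as entered in the eronen1 table in this message
# (the meaning of each count is my reading of that table, not confirmed)
cases_scz <- 58                # sample members with schizophrenia
cases_non <- 852               # sample members without schizophrenia
pop_scz   <- 13600             # population with schizophrenia
pop_non   <- 1947000 - 13600   # population without schizophrenia

# Ratio of the two rates (events per head of population)
rate_ratio <- (cases_scz / pop_scz) / (cases_non / pop_non)

# Usual large-sample standard error of a log rate ratio for Poisson counts:
# only the event counts contribute, not the denominators
se_log_rr <- sqrt(1 / cases_scz + 1 / cases_non)

ci <- exp(log(rate_ratio) + qnorm(c(0.025, 0.975)) * se_log_rr)
print(rate_ratio)
print(ci)
```

With denominators this large relative to the event counts, the rate ratio 
and the odds ratio are bound to be numerically close, so the practical 
difference is mostly in the confidence interval formula.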

For present purposes, my primary question is: now that you have seen the 
Schanda paper, do you think Schanda calculated an odds ratio or a relative 
risk?

Also, when I tried the formula suggested by Peter (below) I obtained an 
error - do you know what M might be or the source of the error?

Error in sum(1/M) : object "M" not found
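
A likely explanation, offered as a guess rather than a confirmed answer: M 
is presumably the 2x2 matrix of counts that Peter passed to fisher.test, 
which was never assigned to a name in his message. A minimal sketch under 
that assumption, which reproduces the interval he quoted:

```r
# Assumption: M is the 2x2 table from Peter's fisher.test call
M <- matrix(c(41, 920, 20109 - 41, 2957239 - 20109 - 920), 2)

# Cross-product odds ratio, same arithmetic as 41*2936210/920/20068
or <- M[1, 1] * M[2, 2] / (M[2, 1] * M[1, 2])

# Classical asymptotic (Wald) interval on the log odds ratio scale;
# sum(1/M) adds the reciprocals of the four cell counts
ci <- exp(log(or) + qnorm(c(0.025, 0.975)) * sqrt(sum(1 / M)))
print(ci)   # agrees with the 4.767384 8.918216 quoted by Peter
```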

 > eronen1 <- as.table(matrix(c(58,852,13600-58,1947000-13600-852), ncol = 2,
     dimnames = list(group=c("scz", "nonscz"), who=c("sample", "population"))))
 > fisher.test(eronen1)

p-value < 2.2e-16
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
   7.309717 12.690087
sample estimates:
odds ratio

 > eronen2 <- as.table(matrix(c(86,1302,13530-86,1933000-13530-1302), ncol = 2,
     dimnames = list(group=c("scz", "nonscz"), who=c("sample", "population"))))
 > fisher.test(eronen2)

p-value < 2.2e-16
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
   7.481272 11.734136
sample estimates:
odds ratio
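
For a side-by-side check against the published figures, the cross-product 
odds ratio and the classical asymptotic interval can be computed for both 
Eronen tables with one small helper. This is only a sketch: wald_or is a 
made-up name, not a function from any package, and the counts are the ones 
entered above.

```r
# Hypothetical helper: cross-product OR with a Wald 95% interval
wald_or <- function(tab) {
  or <- tab[1, 1] * tab[2, 2] / (tab[1, 2] * tab[2, 1])
  half_width <- qnorm(0.975) * sqrt(sum(1 / tab))
  c(or = or,
    lower = exp(log(or) - half_width),
    upper = exp(log(or) + half_width))
}

# Same counts as the eronen1 and eronen2 tables in this message
eronen1 <- matrix(c(58, 852, 13600 - 58, 1947000 - 13600 - 852), ncol = 2)
eronen2 <- matrix(c(86, 1302, 13530 - 86, 1933000 - 13530 - 1302), ncol = 2)

res1 <- wald_or(eronen1)
res2 <- wald_or(eronen2)
print(round(rbind(eronen1 = res1, eronen2 = res2), 2))
```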


Eronen, M. et al. (1996 - 1). Mental disorders and homicidal behavior in 
Finland. Archives of General Psychiatry, 53, 497-501.

Eronen, M. et al. (1996 - 2). Schizophrenia & homicidal behavior. 
Schizophrenia Bulletin, 22, 83-89.

Woodward. Mental disorder & homicide. Epidemiologia E Psichiatria Sociale, 
9, 171-189.

Any comments are welcomed,


At 01:57 PM 28/01/2007 +0000, Michael Dewey wrote:
>At 22:01 26/01/2007, Peter Dalgaard wrote:
>>Bob Green wrote:
>>>Peter & Michael,
>>>I now see my description may have confused the issue.  I do want to 
>>>compare odds ratios across studies - in the sense that I want to create 
>>>a table with the respective odds ratio for each study. I do not need to 
>>>statistically test two sets of odds ratios.
>>>What I want to do is ensure that the method I use to compute an odds 
>>>ratio is accurate, and I intended to check my method against published 
>>>sources.
>>>The paper I selected is Schanda et al (2004), Homicide and major mental 
>>>disorders, Acta Psychiatr Scand, 11:98-107, which reports a total sample 
>>>of 1087. Odds ratios are reported separately for men and women. There 
>>>were 961 men, all of whom were convicted of homicide. Of these 961 men, 
>>>41 were diagnosed with schizophrenia. The unadjusted odds ratio for 
>>>this group of 41 is cited as 6.52 (4.70-9.00). They also report the 
>>>general population aged over 15 with schizophrenia = 20,109 and the 
>>>total population = 2,957,239.
>Looking at the paper (which is in volume 110 by the way) suggests that 
>Peter's reading of the situation is correct and that is what the authors 
>have done.
>>>Any further clarification is much appreciated,
>>A fisher.test on the following matrix seems about right:
>> > matrix(c(41,920,20109-41,2957239-20109-920),2)
>>     [,1]    [,2]
>>[1,]   41   20068
>>[2,]  920 2936210
>> > fisher.test(matrix(c(41,920,20109-41,2957239-20109-920),2))
>>        Fisher's Exact Test for Count Data
>>data:  matrix(c(41, 920, 20109 - 41, 2957239 - 20109 - 920), 2)
>>p-value < 2.2e-16
>>alternative hypothesis: true odds ratio is not equal to 1
>>95 percent confidence interval:
>>4.645663 8.918425
>>sample estimates:
>>odds ratio
>>  6.520379
>>The c.i. is not precisely the same as your source. This could be down to 
>>a different approximation (R's is based on the noncentral hypergeometric 
>>distribution), but the classical asymptotic formula gives
>> > exp(log(41*2936210/920/20068)+qnorm(c(.025,.975))*sqrt(sum(1/M)))
>>[1] 4.767384 8.918216
>>which is closer, but still a bit narrower.
>Michael Dewey
