[R] replicating the odds ratio from a published study

Michael Dewey info at aghmed.fsnet.co.uk
Mon Jan 29 10:44:08 CET 2007

At 21:13 28/01/2007, Bob Green wrote:
>Thanks. Yes, clearly the volume number for the Schanda paper I cited is wrong.
>What is a bit perplexing is that I used the same method Peter
>suggested on two papers by Eronen (referenced below). In R I can
>reproduce an odds ratio similar to the first published paper, e.g.
>OR = 9.7 (CI = 7.4-12.6), whereas my results are quite different
>from the second published paper (Eronen 2), OR = 10.0 (8.1-12.5).
>One reason why I wanted to work out the calculations was so I could 
>analyse data from studies using the same method, for confirmation.
>The additional issue is that Woodward, who is also the author
>of an epidemiological text, says in a review that Eronen used the
>wrong formula in a 1995 paper and indicates that this comment
>also applies to later studies. He stated that "they use methods
>designed for use with binomial data when they really have Poisson
>data. Consequently, they quote odds ratios when they really have
>relative rates and their confidence intervals are
>inaccurate". Eronen1 cites the formula that was used for the OR.
>Schanda sets out his table for the odds ratio in the same way as Eronen1.

There do seem to be difficulties in what they are doing: they have
not observed all the non-homicides, so they estimate how many there
are, and then estimate the number of people with a given diagnosis
using prevalence estimates from another study. I think you are moving
towards writing an article criticising the statistical methods used
in this whole field, which I think goes beyond the resources of R-help.
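
For what it is worth, the distinction Woodward draws can be sketched
with the Schanda figures quoted further down. This is only an
illustration under assumptions that are mine, not the authors': the
homicide counts are treated as Poisson, the population figures as
fixed denominators (ignoring the uncertainty in the prevalence
estimate), and the variable names are made up for the example.

```r
# Figures as quoted in this thread for Schanda et al.
cases_scz   <- 41                # homicide offenders with schizophrenia
cases_other <- 961 - 41          # remaining homicide offenders
pop_scz     <- 20109             # population estimate with schizophrenia
pop_other   <- 2957239 - 20109   # rest of the population

# Relative rate: ratio of the two homicide rates
# (assumption: populations act as fixed denominators)
rate_ratio <- (cases_scz / pop_scz) / (cases_other / pop_other)

# Large-sample SE of log(rate ratio) for Poisson counts uses
# only the two event counts, not the denominators
se_log_rr <- sqrt(1 / cases_scz + 1 / cases_other)

# 95% interval on the log scale, back-transformed
ci_rr <- exp(log(rate_ratio) + qnorm(c(.025, .975)) * se_log_rr)

print(round(c(rate.ratio = rate_ratio, lower = ci_rr[1], upper = ci_rr[2]), 2))
```

Because homicide is so rare, the relative rate and the odds ratio come
out almost identical here, so the choice of label matters less for the
point estimate than for how the interval is justified.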

>For the present purpose, my primary question is: as you have now 
>seen the Schanda paper, would you consider Schanda calculated odds 
>or relative risk?
>Also, when I tried the formula suggested by Peter (below) I obtained 
>an error - do you know what M might be or the source of the error?
>Error in sum(1/M) : object "M" not found
> > eronen1 <- as.table(matrix(c(58, 852, 13600-58, 1947000-13600-852),
> +     ncol = 2, dimnames = list(group = c("scz", "nonscz"),
> +     who = c("sample", "population"))))
> > fisher.test(eronen1)
>p-value < 2.2e-16
>alternative hypothesis: true odds ratio is not equal to 1
>95 percent confidence interval:
>   7.309717 12.690087
>sample estimates:
>odds ratio
>   9.713458
> > eronen2 <- as.table(matrix(c(86, 1302, 13530-86, 1933000-13530-1302),
> +     ncol = 2, dimnames = list(group = c("scz", "nonscz"),
> +     who = c("sample", "population"))))
> > fisher.test(eronen2)
>p-value < 2.2e-16
>alternative hypothesis: true odds ratio is not equal to 1
>95 percent confidence interval:
>   7.481272 11.734136
>sample estimates:
>odds ratio
>    9.42561
>Eronen, M. et al. (1996 - 1). Mental disorders and homicidal behavior
>in Finland. Archives of General Psychiatry, 53, 497-501.
>Eronen, M. et al. (1996 - 2). Schizophrenia & homicidal
>behavior. Schizophrenia Bulletin, 22, 83-89.
>Woodward. Mental disorder & homicide. Epidemiologia E Psichiatria
>Sociale, 9, 171-189.
>Any comments are welcomed,
>At 01:57 PM 28/01/2007 +0000, Michael Dewey wrote:
>>At 22:01 26/01/2007, Peter Dalgaard wrote:
>>>Bob Green wrote:
>>>>Peter & Michael,
>>>>I now see my description may have confused the issue.  I do want 
>>>>to compare odds ratios across studies - in the sense that I want 
>>>>to create a table with the respective odds ratio for each study. 
>>>>I do not need to statistically test two sets of odds ratios.
>>>>What I want to do is ensure that the method I use to compute an odds
>>>>ratio is accurate, so I intended to check it against published sources.
>>>>The paper I selected, Schanda et al (2004). Homicide and major
>>>>mental disorders. Acta Psychiatr Scand, 11:98-107, reports a total
>>>>sample of 1087. Odds ratios are reported separately for men and
>>>>women. There were 961 men, all of whom were convicted of homicide.
>>>>Of these 961 men, 41 were diagnosed with schizophrenia. The
>>>>unadjusted odds ratio for this group of 41 is cited as
>>>>6.52 (4.70-9.00). They also report the general population aged
>>>>over 15 with schizophrenia = 20,109 and the total population = 2,957,239.
>>Looking at the paper (which is in volume 110 by the way) suggests 
>>that Peter's reading of the situation is correct and that is what 
>>the authors have done.
>>>>Any further clarification is much appreciated,
>>>A fisher.test on the following matrix seems about right:
>>> > matrix(c(41,920,20109-41,2957239-20109-920),2)
>>>     [,1]    [,2]
>>>[1,]   41   20068
>>>[2,]  920 2936210
>>> > fisher.test(matrix(c(41,920,20109-41,2957239-20109-920),2))
>>>        Fisher's Exact Test for Count Data
>>>data:  matrix(c(41, 920, 20109 - 41, 2957239 - 20109 - 920), 2)
>>>p-value < 2.2e-16
>>>alternative hypothesis: true odds ratio is not equal to 1
>>>95 percent confidence interval:
>>>4.645663 8.918425
>>>sample estimates:
>>>odds ratio
>>>  6.520379
>>>The c.i. is not precisely the same as your source. This could be 
>>>down to a different approximation (R's is based on the noncentral 
>>>hypergeometric distribution), but the classical asymptotic formula gives
>>> > exp(log(41*2936210/920/20068)+qnorm(c(.025,.975))*sqrt(sum(1/M)))
>>>[1] 4.767384 8.918216
>>>which is closer, but still a bit narrower.
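
On the "object M not found" error: Peter does not define M in his
message, so this is an assumption, but the formula only makes sense
if M is the 2x2 matrix he passed to fisher.test. A minimal sketch
with M assigned first (or_hat, se_log_or and ci are names chosen
here for illustration):

```r
# Assumption: M is the 2x2 table used in the fisher.test call above.
# matrix() fills by column, so rows are schizophrenia / no schizophrenia
# and columns are homicide / no homicide.
M <- matrix(c(41, 920, 20109 - 41, 2957239 - 20109 - 920), 2)

# Cross-product (sample) odds ratio
or_hat <- M[1, 1] * M[2, 2] / (M[1, 2] * M[2, 1])

# Classical asymptotic SE of log(OR): square root of the
# sum of the reciprocals of the four cell counts
se_log_or <- sqrt(sum(1 / M))

# 95% interval on the log scale, back-transformed
ci <- exp(log(or_hat) + qnorm(c(.025, .975)) * se_log_or)

print(or_hat)
print(ci)
```

With M defined this way the last line should give the same interval
Peter quotes. The same steps apply to the eronen1 and eronen2 tables
if M is replaced by the corresponding matrix.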
>>Michael Dewey

Michael Dewey
