[R] two-sample test of multinomial proportion

Gustaf Rydevik gustaf.rydevik at gmail.com
Tue Jan 20 16:41:51 CET 2009


On Tue, Jan 20, 2009 at 4:08 PM, Gustaf Rydevik
<gustaf.rydevik at gmail.com> wrote:
> Hi all,
>
> This is perhaps more a statistics question than an R question, but I
> hope it's OK anyhow.
>
> I have some data (see below) with the number of tests positive to
> subtype H1 of a virus, the number of tests postive to subtype H3, and
> the total number of tests. This is for two different groups, and the
> two subtypes are mutually exclusive.
>
> What is the best way to test if the proportion of H1 tests to all
> positive tests differ between the two groups?
> I could run prop.test() on just the H1 and H3 part of the data,
> ignoring the total number of tests. But this seem to skip some
> information regarding variance of H1/H3 in the two groups, so I don't
> think it is correct.
>
> I've tried using a bootstrap approach on the ratio of the two
> proportions, but there must be a smarter way.
> Any help is much appreciated!
>
> Best regards,
>
> Gustaf Rydevik
>
>
> ####data and bootstrap attempt ###
> multi.data<-data.frame(
>                  group=c("a","b"),
>                  H1=c(2,12),
>                  H3=c(21,46),
>                  tests=c(189,411)
> )
> multi.ind<-data.frame(Type=
> rep(c("H1","H3","Neg"),c(2+12,21+46,189+411-2-12-21-46)),
> group=rep(c("a","b","a","b","a","b"),c(2,12,21,46,189-2-21,411-12-46))
> )
>
> props1<-vector(mode="numeric",length=1000)
> props2<-vector(mode="numeric",length=1000)
> for(i in 1:1000){
> sub.tab<-t(table(Subtyp.orig[sample(1:nrow(Subtyp.orig),nrow(Subtyp.orig),replace=TRUE),]))
> props1[i]<-sub.tab[1,1]/(sub.tab[1,1]+sub.tab[1,2])
> props2[i]<-sub.tab[2,1]/(sub.tab[2,1]+sub.tab[2,2])
> }
> sub.kvot<-props1/props2
> sort(sub.kvot)[50]
> sort(sub.kvot)[950]
>
>
>
> --
> Gustaf Rydevik, M.Sci.
> tel: +46(0)703 051 451
> address:Essingetorget 40,112 66 Stockholm, SE
> skype:gustaf_rydevik
>

ooops - forgot to change a name of the bootstrap code. Below is a
corrected version.

/Gustaf

####data and bootstrap attempt ###
multi.data<-data.frame(
                 group=c("a","b"),
                 H1=c(2,12),
                 H3=c(21,46),
                 tests=c(189,411)
)
multi.ind<-data.frame(Type=
rep(c("H1","H3","Neg"),c(2+12,21+46,189+411-2-12-21-46)),
group=rep(c("a","b","a","b","a","b"),c(2,12,21,46,189-2-21,411-12-46))
)

props1<-vector(mode="numeric",length=1000)
props2<-vector(mode="numeric",length=1000)
for(i in 1:1000){
sub.tab<-t(table(multi.ind[sample(1:nrow(multi.ind),nrow(multi.ind),replace=TRUE),]))
props1[i]<-sub.tab[1,1]/(sub.tab[1,1]+sub.tab[1,2])
props2[i]<-sub.tab[2,1]/(sub.tab[2,1]+sub.tab[2,2])
}
sub.kvot<-props1/props2
sort(sub.kvot)[50]
sort(sub.kvot)[950]




More information about the R-help mailing list