[R] testing independence of categorical variables

David Winsemius dwinsemius at comcast.net
Fri Nov 23 02:47:08 CET 2007


"Shoaaib Mehmood" <shoaaib at gmail.com> wrote in 
news:ab02bb240711220316q25e0bbd6rd2de31610c245422 at mail.gmail.com:

> hi,
> 
> is there a way of calculating or measuring dependence between two
> categorical variables. i tried using the chi square test to test for
> independence but i got error saying that the lengths of the two
> vectors don't match. Suppose X and Y are two factors. X has 5 levels
> and Y has 7 levels. This is what i tried doing
> 
>>temp<-chisq.test(x,y)
> 
> but got error "the lengths of the two vectors don't match". any help
> will be appreciated

If you had posted the data, or a table of it, it would be clearer why 
the error was thrown. Note that your example mixes "x" and "X". R is 
case-sensitive, so these are two different objects.

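chisq.test has no problem with a table whose numbers of rows and columns 
differ, so 5 levels against 7 is fine. When it is given two vectors, 
though, they must have the same length (one entry per observation), 
whatever the number of levels in each.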
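
As a minimal sketch of the two-vector case, assuming hypothetical factors 
x and y recorded on the same 700 cases (the data are simulated here, not 
taken from the original post), the test can be run either on the vectors 
directly or on their cross-tabulation:

```r
# Hypothetical data: two factors observed on the same n cases
set.seed(1)
n <- 700
x <- factor(sample(letters[1:5], n, replace = TRUE))  # 5 levels
y <- factor(sample(LETTERS[1:7], n, replace = TRUE))  # 7 levels

# chisq.test(x, y) needs one entry per case in both vectors
stopifnot(length(x) == length(y))

tab <- table(x, y)      # 5 x 7 contingency table of counts
res <- chisq.test(tab)  # equivalent to chisq.test(x, y)
res$parameter           # degrees of freedom: (5 - 1) * (7 - 1)
```

If length(x) and length(y) differ, the two vectors cannot describe the 
same cases, and that is worth checking before anything else.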

#simulate a 5 x 7 table
> TT<-r2dtable(1,5*c(1,8,5,8,4),5*c(3,3,3,3,4,4,6))
> TT
[[1]]
     [,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,]    0    1    1    0    2    1    0
[2,]    3    3    6    6    2    8   12
[3,]    1    2    3    3    9    2    5
[4,]    8    3    3    3    6    7   10
[5,]    3    6    2    3    1    2    3
#general test for association; TT is a list of length 1, so pass only
#TT[[1]] (chisq.test ignores a second argument when the first is a matrix)
> chisq.test(TT[[1]])

        Pearson's Chi-squared test

data:  TT[[1]] 
X-squared = 33.5942, df = 24, p-value = 0.09214

Warning message:
In chisq.test(TT[[1]]) :
  Chi-squared approximation may be incorrect
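
The warning above appears because some expected cell counts are small 
(the first row totals only 5). A sketch of two possible follow-ups, with 
the table re-entered by hand from the printout: a Monte Carlo p-value via 
the simulate.p.value argument of chisq.test, and Cramer's V as one common 
measure of the strength of association. The names tab, mc and cramers_v 
are just illustrative, and the choice of B = 10000 is an assumption.

```r
# The 5 x 7 table printed above, entered by hand
tab <- matrix(c(0, 1, 1, 0, 2, 1,  0,
                3, 3, 6, 6, 2, 8, 12,
                1, 2, 3, 3, 9, 2,  5,
                8, 3, 3, 3, 6, 7, 10,
                3, 6, 2, 3, 1, 2,  3),
              nrow = 5, byrow = TRUE)

# Monte Carlo p-value: does not rely on the chi-squared approximation
set.seed(42)
mc <- chisq.test(tab, simulate.p.value = TRUE, B = 10000)
mc$p.value

# Cramer's V: the chi-squared statistic rescaled to lie in [0, 1]
chi2 <- unname(suppressWarnings(chisq.test(tab))$statistic)
n    <- sum(tab)            # total count
k    <- min(dim(tab)) - 1   # smaller table dimension minus one
cramers_v <- sqrt(chi2 / (n * k))
cramers_v
```

For this table Cramer's V comes out at about 0.25. A value of 0 means no 
association and 1 means perfect association.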

-- 
David Winsemius


