[R] how to correlate nominal variables?

Daniel Malter daniel at umd.edu
Tue Aug 25 18:41:16 CEST 2009


I updated the previously posted function for Cramer's V so that it
automatically prints Cramer's V, the chi-square statistic, the degrees of
freedom, and the p-value of the chi-square test (computed from the
chi-square value and the degrees of freedom), each rounded to a
user-supplied number of digits. An example is included.

cramers.v=function(x,digits){
    #x: two-column data frame of categorical variables
    #digits: rounding digits for V, chi-square, DFs, and p-value (in that order)
    #Cross-tabulate once; table() drops rows containing NA
    tab=table(as.data.frame(x))
    #Use the table total as n so that it matches the counts actually tabulated
    n=sum(tab)
    row.sum=rowSums(tab)
    col.sum=colSums(tab)
    #Expected cell counts under independence
    expected=outer(row.sum,col.sum)/n
    #Pearson chi-square statistic (no continuity correction)
    chisq=sum((tab-expected)^2/expected)
    v=sqrt(chisq/(n*(min(dim(tab))-1)))
    dfs=(nrow(tab)-1)*(ncol(tab)-1)
    #Upper-tail probability; more accurate than 1-pchisq() for tiny p-values
    sig=pchisq(chisq,dfs,lower.tail=FALSE)
    result=data.frame(round(v,digits[1]),round(chisq,digits[2]),round(dfs,digits[3]),round(sig,digits[4]))
    names(result)=c("Cramer's V","Chi-square","DFs","p-value")
    print(result)
  }
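
As a cross-check, the same quantities can probably be obtained from R's
built-in chisq.test(). The sketch below is not part of the function above:
the helper name cramers.v.check and the simulated a and b are made up for
illustration. It assumes that correct=FALSE is wanted, so that no Yates
continuity correction is applied to 2x2 tables.

```r
# Hypothetical helper (illustration only): Cramer's V via chisq.test()
cramers.v.check=function(a,b){
    tab=table(a,b)
    # correct=FALSE switches off the continuity correction that
    # chisq.test() would otherwise apply to 2x2 tables
    test=suppressWarnings(chisq.test(tab,correct=FALSE))
    n=sum(tab)
    chisq=unname(test$statistic)
    v=sqrt(chisq/(n*(min(dim(tab))-1)))
    c(V=v,chisq=chisq,df=unname(test$parameter),p=test$p.value)
}

# Made-up data: two unrelated categorical variables
set.seed(1)
a=sample(1:4,200,replace=TRUE)
b=sample(1:3,200,replace=TRUE)
check=cramers.v.check(a,b)
print(check)
```

If the two approaches disagree on a given table, the continuity correction
or missing values in the input are the first things to look at.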

##Example
#Create correlated a and b
a=rnorm(100)
e=rnorm(100)
b=a+e

#Split a and b into quartile categories
a=cut(a,breaks=quantile(a),include.lowest=TRUE,labels=FALSE)
b=cut(b,breaks=quantile(b),include.lowest=TRUE,labels=FALSE)

#Cross-tabulate a and b
table(a,b)

#Compute Cramer's V, Chi-square, degrees of freedom and significance
#supply the (maximum) number of digits you want for each value
cramers.v(data.frame(a,b),digits=c(3,2,0,5))
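
To help read the output: Cramer's V lies between 0 (no association) and 1
(perfect association). The two small tables below are constructed by hand
to hit both ends of that range. The helper v.of is hypothetical and only
repeats the V formula so that the snippet runs on its own.

```r
# Hypothetical helper (illustration only): Cramer's V from a contingency table
v.of=function(tab){
    expected=outer(rowSums(tab),colSums(tab))/sum(tab)
    chisq=sum((tab-expected)^2/expected)
    sqrt(chisq/(sum(tab)*(min(dim(tab))-1)))
}

# Perfect association: each row category occurs with exactly one column category
perfect=table(rep(1:3,each=10),rep(c("x","y","z"),each=10))

# Exact independence: every cell equals its expected count
indep=table(rep(1:2,each=20),rep(rep(c("x","y"),each=10),2))

v.perfect=v.of(perfect)
v.indep=v.of(indep)
print(c(perfect=v.perfect,independent=v.indep))
```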