[R] Categorical data

Sat Nov 3 13:21:58 CET 2007

Dear Sirs

Which of the following residuals gives the best approximation to the
standard normal distribution?

necessidade = c("sem necessidade", "com necessidade")
tipo = c("CE-1", "CE-2", "CE-3")
dados = c(20, 34, 44, 69, 9, 3)
Tabela = cbind(expand.grid(list(Necessidade=necessidade, Tipo=tipo)),
               count=dados)
Tabela.array = tapply(Tabela$count, Tabela[,1:2], sum)
ni = rowSums(Tabela.array)
nj = colSums(Tabela.array)
n = sum(Tabela.array)
fit.glm = glm(count~Necessidade+Tipo, data=Tabela, family=poisson)

#############
chisq.test(Tabela.array)
############
resid.pear = residuals(fit.glm, type="pearson")
# This one? (No residuals are outside the range -1.96 to 1.96.)
###########
resid.pear.mat = matrix(resid.pear, ncol=3, byrow=FALSE,
                        dimnames=list(c("sem necessidade", "com necessidade"),
                                      c("CE-1", "CE-2", "CE-3")))
n*resid.pear.mat/sqrt(outer(n-ni, n-nj, "*"))
# Or this one? (Residuals are outside the range -1.96 to 1.96.)
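
For comparison, the two candidates above can be cross-checked against what R
itself returns. This is only a sketch, assuming a reasonably recent R in which
chisq.test() returns the components `residuals` (Pearson residuals) and
`stdres` (standardized residuals), and in which rstandard() accepts
type="pearson" for glm fits. The names tab, expected, pearson, adjusted, dat
and fit are introduced here for illustration and are not part of the code
above.

```r
# Rebuild the 2 x 3 table (rows: Necessidade, columns: Tipo), filled by column
tab <- matrix(c(20, 34, 44, 69, 9, 3), nrow = 2,
              dimnames = list(Necessidade = c("sem necessidade", "com necessidade"),
                              Tipo = c("CE-1", "CE-2", "CE-3")))
n  <- sum(tab)
ni <- rowSums(tab)
nj <- colSums(tab)

# Expected counts under independence
expected <- outer(ni, nj) / n

# First candidate: plain Pearson residuals
pearson <- (tab - expected) / sqrt(expected)

# Second candidate: Pearson residuals divided by their estimated standard
# error, sqrt((1 - ni/n) * (1 - nj/n)); same as n * r / sqrt((n - ni) * (n - nj))
adjusted <- pearson / sqrt(outer(1 - ni / n, 1 - nj / n))

# Compare with the components returned by chisq.test()
# (suppressWarnings: one expected count is below 5, so R warns)
test <- suppressWarnings(chisq.test(tab))
same_pearson  <- isTRUE(all.equal(pearson,  test$residuals, check.attributes = FALSE))
same_adjusted <- isTRUE(all.equal(adjusted, test$stdres,    check.attributes = FALSE))

# Compare with the standardized Pearson residuals of the Poisson log-linear fit
dat <- as.data.frame(as.table(tab))   # columns: Necessidade, Tipo, Freq
fit <- glm(Freq ~ Necessidade + Tipo, data = dat, family = poisson)
same_glm <- isTRUE(all.equal(as.vector(adjusted),
                             unname(rstandard(fit, type = "pearson")),
                             tolerance = 1e-6))

print(round(pearson, 3))
print(round(adjusted, 3))
print(c(same_pearson, same_adjusted, same_glm))
```

If the three comparisons agree, the first candidate is what chisq.test() calls
`residuals` and the second is what it calls `stdres`. Only the second is
divided by an estimate of its own standard error.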

Thanks

Jorge