[R] poLCA : Is maximum number of variables limited?

KDT dkadengye at gmail.com
Tue May 29 11:51:39 CEST 2012


Dear R-users,

I keep getting the error "Error in model.matrix.default(formula, mframe) :
model frame and formula mismatch in model.matrix()" when I fit poLCA with
more than 63 variables. The details are below.

I am trying to do a latent class analysis (LCA) using poLCA. My data set
contains binary scores of, for instance, 200 students on 100 items, and these
numbers could grow in due course. The data frame on which I want to perform
the LCA looks like the one shown below (first five persons). Each row holds
one person's scores on the 100 items.

item1  item2  item3  item4  item5  item6  ...  item97  item98  item99  item100
    1      1      0      1      1      1  ...       1       0       1        1
    0      0      0      0      1      1  ...       1       0       1        0
    0      1      0      1      0      1  ...       0       1       0        1
    1      1      0      1      1      1  ...       1       0       1        1
    1      1      0      1      1      1  ...       1       1       1        1
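
For anyone who wants to try this without the real data, a data frame of the same shape can be simulated roughly as follows. This is only a sketch: the seed, the success probability of 0.5 and the names n_persons and n_items are illustrative assumptions, not properties of the actual data set.

```r
# Simulated stand-in for the real data: 200 persons x 100 binary items.
# The seed and the 0.5 success probability are arbitrary assumptions.
set.seed(1)
n_persons <- 200
n_items   <- 100

# One row per person, one column per item, entries 0 or 1
datax <- as.data.frame(
  matrix(rbinom(n_persons * n_items, size = 1, prob = 0.5),
         nrow = n_persons, ncol = n_items)
)
colnames(datax) <- paste0("item", seq_len(n_items))

dim(datax)  # number of rows (persons) and columns (items)
```

Changing n_items to 63 or 64 should make it easy to check on which side of the limit the error appears.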

On this data frame (here named datax), I perform the LCA as follows:

library(poLCA)

datax.int <- datax + 1  # poLCA needs categories coded 1, 2, ...
# All items are dependent (manifest) variables, no covariates
f <- as.formula(paste("cbind(",
                      paste(colnames(datax.int), collapse = ","),
                      ") ~ 1"))
fit  <- list()        # collect fits
Kmax <- 5             # maximum number of classes
bic  <- rep(0, Kmax)  # vector of BIC values
ll   <- rep(0, Kmax)  # vector of log-likelihood values
for (j in 1:Kmax) {   # fits for 1, 2, ..., Kmax classes
  cat(j, "\n")        # print current number of classes
  # 20 random starts per model
  fit[[j]] <- poLCA(f, datax.int, nclass = j, nrep = 20, verbose = FALSE)
  bic[j] <- fit[[j]]$bic   # collect BICs
  ll[j]  <- fit[[j]]$llik  # collect log-likelihoods
}

Then I get this error:

Error in model.matrix.default(formula, mframe) :
  model frame and formula mismatch in model.matrix()

What confuses me is that the code runs just fine when the number of items is
restricted to 63 or fewer. I have checked this with 200 and with 500 persons:
as long as there are 63 or fewer columns (items), I do not get an error.
Note that my real data set can contain hundreds of items from thousands of
persons. I wonder where I am going wrong.

Any ideas? Thank you in advance!!

Trevor





