[R] cv.glmnet errors

Loren Collingwood loren.collingwood at gmail.com
Sun Mar 6 07:59:51 CET 2011


I came across the same thing, both when doing multinomial cross-validation 
with cv.glmnet and when doing it by hand with a for loop that subsets the X 
matrix and the y response categories. I've tested it various ways, and I 
think the problem occurs because, in one of the folds, at least one response 
category has no observations. From what I gather, this trips up glmnet. See 
the table output below: in the first case no zeroes appear, but in the second 
a zero appears (category 22).

# alldata is a dataframe; allcodes is the factor response variable
rand <- sample(3, dim(alldata)[1], replace=TRUE)

obj1 <- glmnet(x=alldata[rand!=2,], y=allcodes[rand!=2],
               family="multinomial", maxit=500)  # works
obj2 <- glmnet(x=alldata[rand!=3,], y=allcodes[rand!=3],
               family="multinomial", maxit=500)  # doesn't work


> table(allcodes[rand!=2])

 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 
84 31 14 67  8  9  8 16 31  5 11  3 35  3  9  7  2 17 18 12  3  1  4  1 
> table(allcodes[rand!=3])

 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 
85 20 14 72 12  7 13 15 32  4 13  3 26  3 15  5  6 13 23 16  1  0  3  1 
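
The same check can be run for every held-out fold at once. This is only a 
sketch on made-up stand-in data (the simulated allcodes and rand below are 
assumptions, not my real data), and it uses base R only:

```r
# Hypothetical stand-in data: 24 classes, the last four of them rare
set.seed(1)
allcodes <- factor(sample(1:24, 300, replace = TRUE,
                          prob = c(rep(10, 20), rep(1, 4))),
                   levels = 1:24)
rand <- sample(3, length(allcodes), replace = TRUE)

# Rows are classes, columns are the held-out fold; each cell counts the
# observations of that class that remain in the training set
train_counts <- sapply(1:3, function(k) table(allcodes[rand != k]))
colnames(train_counts) <- paste0("without_fold_", 1:3)

# Classes that have zero observations in at least one training set
empty <- which(apply(train_counts == 0, 1, any))
print(train_counts)
print(empty)
```

Any class listed in empty is a candidate for the failure described above.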

I've looked at this with various random sequences, and it always seems to 
work when there are no zeroes and to crash when there are zeroes. I'm working 
on a small data frame here (because of memory issues), so I don't think that, 
in general, I would have zeroes in the per-fold code categories.
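
One possible workaround, which I have not tested against the real data, is to 
assign folds within each class so that no class ends up entirely in one fold. 
The helper stratified_folds below is hypothetical (it is not part of glmnet); 
the result would be passed through the foldid argument of cv.glmnet. A class 
with only a single observation still disappears from the training set when 
its fold is held out, so such classes would have to be merged or dropped.

```r
# Hypothetical helper: assign fold ids separately within each class
stratified_folds <- function(y, nfolds = 3) {
  y <- factor(y)                      # drops unused levels
  foldid <- integer(length(y))
  for (idx in split(seq_along(y), y)) {
    # shuffle the members of this class, then deal them out across folds
    idx <- idx[sample.int(length(idx))]
    foldid[idx] <- rep_len(sample.int(nfolds), length(idx))
  }
  foldid
}

# Made-up response with some rare classes (an assumption for illustration)
set.seed(42)
y <- factor(rep(1:5, times = c(40, 20, 6, 3, 2)))
foldid <- stratified_folds(y, nfolds = 3)

# Every class with at least 2 observations stays present in each training set
print(sapply(1:3, function(k) table(y[foldid != k])))

# With glmnet loaded and x a numeric matrix, the call would look like:
# cvfit <- cv.glmnet(x, y, family = "multinomial", foldid = foldid)
```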

-Loren

