[R] Standardisation of variables with Lasso (glmnet)

Thu Feb 13 21:23:00 CET 2014

   Dear all,

   I am working with glmnet, but the problem also arises in all other Lasso
   implementations.
   It is usually recommended to standardize the variables and use an intercept,
   and this works well with the built-in options:

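   library(glmnet)
   x <- matrix(rnorm(10000), ncol = 50)   # 200 observations, 50 predictors
   y <- rnorm(200)
   cv.out <- cv.glmnet(x, y, alpha = 1, intercept = TRUE, standardize = TRUE)
   coefs <- coef(cv.out, s = "lambda.min")
   ind1 <- which(coefs[, 1] != 0)         # non-zero coefficients, incl. intercept
   coefs[ind1, ]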

   but when I would like to do this by hand:

   xs <- apply(x, 2, function(col) (col - mean(col)) / sd(col))
   ys <- y - mean(y)
   cv.out =cv.glmnet(xs,ys, alpha =1, intercept=F , standardize=F)
   coefs <- coef(cv.out, s = "lambda.min")
   ind1 <- which(coefs[, 1] != 0)         # non-zero coefficients
   coefs[ind1, ]

   The following error appears:

   > cv.out =cv.glmnet(xs,ys, alpha =1, intercept=F , standardize=F)
   Error in elnet(x, is.sparse, ix, jx, y, weights, offset, type.gaussian,  :
                    NA/NaN/Inf in foreign function call (arg 5)
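
   Since the message complains about NA/NaN/Inf values, one thing that could
   be checked first is whether the hand-standardised inputs are all finite.
   The following is only a minimal sketch in base R: the seed and the names
   col_sd, bad_cols and all_finite are assumptions added for illustration and
   are not part of the original code.

```r
# Sketch only: rebuild the inputs as in the post and inspect them before fitting.
set.seed(1)                            # assumed seed, only for reproducibility
x <- matrix(rnorm(10000), ncol = 50)   # 200 observations, 50 predictors
y <- rnorm(200)

# scale() centres each column and divides by its sample standard deviation
# (n - 1 denominator), which matches the apply() call used above.
xs <- scale(x)
ys <- y - mean(y)

# A constant column has standard deviation 0 and would become NaN after scaling.
col_sd <- apply(x, 2, sd)
bad_cols <- which(col_sd == 0)

# TRUE only if neither the predictors nor the response contain NA/NaN/Inf.
all_finite <- all(is.finite(xs)) && all(is.finite(ys))
all_finite
```

   If all_finite is TRUE and bad_cols is empty, the inputs themselves are
   presumably not the source of the non-finite values.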

   Therefore, my question is: what am I doing wrong, and what is the "best
   practice" with Lasso (intercept yes/no, standardisation by hand, ...)?

   Thank you very much for your efforts and replies in advance!

   Best,

   Martin