[R] glmnet: converting coefficients back to original scale

Mark Seeto markseeto at gmail.com
Thu Mar 19 11:38:18 CET 2015


Dear R-help,

I'm having trouble understanding how glmnet converts its coefficient
estimates back to the original scale. Here's an example with ridge
regression:

########################################################################
library(glmnet)
set.seed(1)

n <- 20  # sample size

d <- data.frame(x1 = rnorm(n, 1, 1), x2 = rnorm(n, 10, 2), y = rnorm(n, 1, 2))

# Sample means
mx1 <- mean(d$x1)
mx2 <- mean(d$x2)
my <- mean(d$y)

# Scaling factors ("1/n standard deviations")
sx1 <- sd(d$x1)*sqrt((n - 1)/n)
sx2 <- sd(d$x2)*sqrt((n - 1)/n)
sy <- sd(d$y)*sqrt((n - 1)/n)

# Scaled variables
d$x1s <- (d$x1 - mx1)/sx1
d$x2s <- (d$x2 - mx2)/sx2
d$ys <- (d$y - my)/sy

lam <- 0.5  # lambda value

# Using scaled variables (same result for standardize=TRUE and
standardize=FALSE)
glmnet1 <- glmnet(as.matrix(d[, c("x1s", "x2s")]), d$ys, alpha=0, lambda = lam)

# Using unscaled variables
glmnet2 <- glmnet(as.matrix(d[, c("x1", "x2")]), d$y, alpha=0, lambda=lam)

coef(glmnet2)
##                     s0
## (Intercept)  2.5658491
## x1           0.3471199
## x2          -0.1703715

# I want to calculate the glmnet2 coef estimates from the glmnet1 coef
estimates.
# The following attempts are based on rearrangement of
# (y - my)/sy = beta1*(x1 - mx1)/sx1 + beta2*(x2 - mx2)/sx2

my - coef(glmnet1)["x1s", "s0"]*mx1*sy/sx1 - coef(glmnet1)["x2s",
"s0"]*mx2*sy/sx2
# 2.430971
# Not the same as coef(glmnet2)["(Intercept)", "s0"]

coef(glmnet1)["x1s", "s0"]*sy/sx1
# 0.3096897
# Not the same as coef(glmnet2)["x1", "s0"]

coef(glmnet1)["x2s", "s0"]*sy/sx2
# -0.1524043
# Not the same as coef(glmnet2)["x2", "s0"]

######################################################################

I can apply a similar method (with centring of y instead of
standardisation) to successfully get the coefficient estimates on the
original scale given by lm.ridge in the MASS package. I would
appreciate any help anyone can give on where I'm going wrong with
glmnet.

Thanks,
Mark



More information about the R-help mailing list