[R] Lasso Regression error

David Winsemius dwinsemius at comcast.net
Sun May 5 20:13:36 CEST 2013


On May 4, 2013, at 10:26 PM, Preetam Pal wrote:

> Thanks David for the paper, I understand the theory.
> 
> But my question is about R only: does the vector of coefficients that lars() outputs apply to the original variable y or to (y - y_bar)? I have set intercept=T in my lars() model as well.
> I need this information to calculate the residuals.
> 
> To illustrate my point:
> 
> I put lasso=lars(x,y,intercept=T)
> 
> R gives me the coefficient beta.
> 
> Does this mean the model is y=x*beta
> or is it       (transformed y) = beta*(transformed x)?
> 
> I guess R first transforms the variables, finds the optimum beta, and then readjusts the estimates to fit the original x and y variables. I am a bit confused because, in that case, R should have returned something (a function of x_bar and y_bar) as the intercept, which it clearly does not. I am not able to find any documentation on this.

This suggests you have not actually looked at the object returned. In particular:

lasso[c("normx", "meanx")]
#---------
$normx
[1] 6.198815e+05 2.786535e+02 6.202666e+00

$meanx
           g            h            u 
8.519395e+05 5.469499e+02 7.519258e+00 


attributes(lasso[["beta"]])
#------------------
$dim
[1] 4 3

$dimnames
$dimnames[[1]]
[1] "0" "1" "2" "3"

$dimnames[[2]]
[1] "g" "h" "u"


$`scaled:scale`
[1] 6.198815e+05 2.786535e+02 6.202666e+00

(I wouldn't have expected a y_bar to be returned myself.)
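
For what it's worth, here is a minimal sketch of how the components shown above could be combined to recover an intercept and residuals on the original scale. It uses synthetic stand-in data (not the poster's data.txt) and assumes three things worth checking against your installed version with names(lasso): that the coefficients in `beta` are already rescaled to the original x units (which the `scaled:scale` attribute suggests), that the object carries a `mu` component holding the mean of the response, and that predict() with mode="fraction" and s=1 returns the fit at the end of the path.

```r
library(lars)

# Synthetic stand-in data (an assumption, NOT the poster's data.txt);
# the column names g, h, u only mirror the thread.
set.seed(1)
n <- 50
x <- cbind(g = rnorm(n, 8e5, 1e5), h = rnorm(n, 500, 50), u = rnorm(n, 7, 1))
l <- 2 + 1e-6 * x[, "g"] - 0.01 * x[, "h"] + 0.5 * x[, "u"] + rnorm(n)

lasso <- lars(x, l, type = "lasso", intercept = TRUE)

# Coefficients at the last step of the path, in original x units.
beta <- coef(lasso)[nrow(coef(lasso)), ]

# Implied intercept: y_bar - sum(x_bar * beta).
# Assumes lasso$mu is the mean of the response.
intercept <- lasso$mu - sum(lasso$meanx * beta)

# Fitted values and residuals against the original (untransformed) y.
fitted_manual <- drop(intercept + x %*% beta)
res <- l - fitted_manual

# Cross-check against predict() at the end of the path (fraction = 1).
fitted_pred <- predict(lasso, newx = x, s = 1, mode = "fraction")$fit
all.equal(fitted_manual, fitted_pred, check.attributes = FALSE)
```

If the final all.equal() does not come back TRUE on your installation, one of the assumptions above does not hold for your version, and the residuals are safer taken as l minus the predict() fit.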

-- 
David.

> Appreciate your help on this.
> Thanks, 
> Preetam
> 
> 
> On Sun, May 5, 2013 at 12:55 AM, David Winsemius <dwinsemius at comcast.net> wrote:
> 
> On May 4, 2013, at 10:13 AM, Preetam Pal wrote:
> 
> > Hi,
> > I rectified my error (thanks David for pointing it out)
> > Now I have been able to run the code:
> >
> > data=read.table("data.txt", header=T)
> > l=data$LOSS
> > h=data$HPI
> > u=data$UE
> > g=data$GDP
> >
> > matrix=cbind(g,h,u)
> > lasso=lars(matrix,l)
> >
> >
> > The final set of coefficients for the regression is the last row of coef(lasso). Am I right?
> > Plus what happens to the intercept estimate? It is not available in coef(lasso).
> 
> Please read the cited documentation ... top of page 3:
> http://www-stat.stanford.edu/~hastie/Papers/LARS/LeastAngle_2002.pdf
> 
> "By location and scale transformations we can always assume that the covariates have been standardized to have mean 0 and unit length, and that the response has mean 0,"
> 
> Hence no need for an Intercept.
> 
> --
> David.
> >
> > Any help is welcome.
> >
> > Thanks,
> > Preetam
> >
> >
> > On Sat, May 4, 2013 at 9:52 PM, David Winsemius <dwinsemius at comcast.net> wrote:
> >
> > On May 4, 2013, at 6:09 AM, Preetam Pal wrote:
> >
> > > Hi all,
> > > I have a data set containing variables LOSS, GDP, HPI and UE.
> > > (I have attached it in case it is required).
> > >
> > > Having renamed the variables as l,g,h and u, I wish to run a Lasso
> > > Regression with l as the dependent variable and all  the other 3 as the
> > > independent variables.
> > >
> > > data=read.table("data.txt", header=T)
> > > l=data$LOSS
> > > h=data$HPI
> > > u=data$UE
> > > g=data$GDP
> > >
> > > matrix=data.frame(l,g,h,u)
> > > lasso=lars(matrix,l)
> > >
> > > But R is throwing an error (shown below) at this:
> > >
> > > Error in rep(1, n) : invalid 'times' argument
> >
> > I get a different error using package:lars version 1.1, but the problem is likely the same. You created an object named `matrix` which is not a matrix. You apparently expected `lars` to recognize your intent. It didn't. (You also included your response variable in your set of predictors. `lars` will run this without error, but treats it like a tautology.) Try offering the types of R objects that `lars` is documented to accept.
> >
> > >
> > > Can you kindly suggest where I went wrong?
> > >
> > > [Just wanted to mention that I am getting the same error when instead of
> > > the matrix of  predictor variables, I am using only a single variable, say,
> > > g : lasso=lars(g,l)]
> > >
> > > Appreciate any help.
> > >
> >
> 
> 
> David Winsemius
> Alameda, CA, USA

David Winsemius
Alameda, CA, USA


