[R] glmnet inclusion / exclusion of categorical variables

Kevin Shaney kevin.shaney at rosetta.com
Fri Aug 9 15:44:46 CEST 2013


Hello -

I have been using glmnet calls of the following form to predict a multinomial (multi-class) dependent variable:

mglmnet <- glmnet(xxb, yb, alpha = ty, dfmax = dfm,
                  family = "multinomial", standardize = FALSE)

I am using both continuous and categorical variables as predictors, and I use sparse.model.matrix to code my predictors into a matrix. This turns a categorical variable, for example one named V1 with values "1", "2" or "3", into two recoded indicator variables, V12 (1 or 0) and V13 (1 or 0).
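
To make the recoding concrete, here is a minimal sketch with made-up data (the data frame `df` and the column `x1` are hypothetical, not my actual data). With R's default treatment contrasts, the three-level factor V1 becomes the two indicator columns V12 and V13, with level "1" as the baseline:

```r
library(Matrix)

# Hypothetical toy data: one 3-level factor and one continuous predictor
df <- data.frame(
  V1 = factor(c("1", "2", "3", "2", "1", "3")),
  x1 = c(0.5, 1.2, -0.3, 2.0, 0.1, 0.7)
)

# Build the sparse design matrix and drop the intercept column,
# since glmnet fits its own intercept
xx <- sparse.model.matrix(~ ., data = df)[, -1]

colnames(xx)   # "V12" "V13" "x1"
print(xx)      # V12 is 1 where V1 == "2", V13 is 1 where V1 == "3"
```

Nothing in the resulting matrix records that V12 and V13 belong together; to the fitting routine they are just two separate columns.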

As I cycle through different penalties, I would like to have either both recoded variables included or both excluded, but never just one of them, and I can't figure out how to make that work. I tried changing the "type.multinomial" option, since it looks as though it should do what I want, but I can't get it to work (maybe the difference in the recoded variable names is driving this).
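
For reference, a small sketch of what I tried, using simulated data (every name and value below is made up). As far as I can tell from the glmnet documentation, type.multinomial = "grouped" ties together the coefficients of a single column across the response classes, so it would not by itself link V12 and V13:

```r
library(glmnet)
library(Matrix)

set.seed(1)
n <- 300

# Simulated predictors: one 3-level factor and two continuous variables
df <- data.frame(
  V1 = factor(sample(c("1", "2", "3"), n, replace = TRUE)),
  x1 = rnorm(n),
  x2 = rnorm(n)
)
# Simulated 3-class response (pure noise, for illustration only)
y <- factor(sample(c("a", "b", "c"), n, replace = TRUE))

x <- sparse.model.matrix(~ ., data = df)[, -1]

fit <- glmnet(x, y, family = "multinomial", type.multinomial = "grouped")

# Coefficients at one penalty value: a list with one sparse column per class
s  <- fit$lambda[min(5, length(fit$lambda))]
cf <- coef(fit, s = s)

# Rows = columns of x (intercept dropped), columns = response classes
b <- sapply(cf, function(m) as.matrix(m)[-1, 1])
nonzero <- b != 0
print(nonzero)
```

In the printed pattern each row should be all TRUE or all FALSE (one column of x enters for every class at once), but the rows for V12 and V13 are still free to differ from each other.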

To summarize: for categorical variables, I would like to constrain the inclusion / exclusion of recoded variables hierarchically, so that either all of the recoded variables from the same original categorical variable are in the model, or all are out.
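
To state the constraint precisely, here is a minimal base-R sketch (toy data; the helper `partially_included` is a hypothetical name, not part of glmnet). It uses the "assign" attribute of model.matrix to map each recoded column back to its original variable, then flags any variable whose columns are only partly selected:

```r
# Hypothetical toy data, only used to build the column-to-variable map
df <- data.frame(
  V1 = factor(c("1", "2", "3", "2")),
  V2 = factor(c("a", "b", "a", "c")),
  x1 = c(0.5, 1.2, -0.3, 2.0)
)
mm <- model.matrix(~ ., data = df)

# "assign" gives, for each column, the index of the term it came from
grp <- attr(mm, "assign")[-1]        # drop the intercept entry
names(grp) <- colnames(mm)[-1]       # V12, V13, V2b, V2c, x1

# TRUE for every original variable that is only partly in the model
partially_included <- function(nonzero, grp) {
  tapply(nonzero, grp, function(z) any(z) && !all(z))
}

# Made-up selection pattern: V12 in, V13 out, both V2 columns in, x1 out
nonzero <- c(V12 = TRUE, V13 = FALSE, V2b = TRUE, V2c = TRUE, x1 = FALSE)
res <- partially_included(nonzero, grp)
print(res)   # only group 1 (V1) is flagged
```

What I am after is a fit for which this check comes back FALSE for every group at every penalty value.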

Thanks!
Kevin
