[R] [R-pkgs] glmnet 1.9-3 uploaded to CRAN (with intercept option)

Trevor Hastie hastie at stanford.edu
Sat Mar 2 02:56:11 CET 2013


This update adds an intercept option (by popular request): one can now fit a model without an intercept.

Glmnet is a package that fits the regularization path for a number of generalized linear models, with "elastic net"
regularization (a tunable mixture of L1 and L2 penalties). Glmnet uses pathwise coordinate descent, and is very fast.
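As a rough illustration of what "pathwise coordinate descent" means, here is a minimal sketch for the least squares case. It is not glmnet's code (glmnet is an R package with a compiled core); it is plain Python written only for this note. The function names, the 1/(2N) scaling of the loss, and the absence of an intercept are assumptions of the sketch. The penalty follows the form written in this announcement, (1-a)*||beta||_2^2 + a*||beta||_1.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator: sign(z) * max(|z| - gamma, 0)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net_cd(X, y, lam, a, beta=None, n_sweeps=200, tol=1e-10):
    """Cyclic coordinate descent (illustrative, hypothetical helper) for

        (1/(2N)) * ||y - X beta||^2
            + lam * ((1 - a) * ||beta||_2^2 + a * ||beta||_1)

    X is a list of N rows of length p. No intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p if beta is None else list(beta)
    # Current residual y - X beta, kept up to date after every coordinate move.
    resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    # (1/N) * sum of squares of each column.
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            old = beta[j]
            # Inner product of column j with the partial residual
            # (the residual with variable j's own contribution added back).
            z = sum(X[i][j] * resid[i] for i in range(n)) / n + col_ss[j] * old
            denom = col_ss[j] + 2.0 * lam * (1.0 - a)
            new = soft_threshold(z, lam * a) / denom if denom > 0 else 0.0
            if new != old:
                delta = new - old
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta


def elastic_net_path(X, y, lambdas, a):
    """Solve along a decreasing lambda sequence, warm-starting each fit
    from the previous solution (the "pathwise" part)."""
    path, beta = [], None
    for lam in lambdas:
        beta = elastic_net_cd(X, y, lam, a, beta=beta)
        path.append(beta)
    return path
```

Calling elastic_net_path(X, y, [1.0, 0.3, 0.1], a=1.0) would return one coefficient vector per lambda value. The warm start is the reason a whole path costs little more than a single fit.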

The models currently covered are:

least squares linear regression
binary logistic regression
multinomial logistic regression (grouped and ungrouped)
poisson regression
multi-response linear regression (grouped)
Cox proportional-hazards model


Some of the features of glmnet:

* By default it computes the path at 100 values of the regularization parameter lambda, uniformly spaced on the log scale.
  Alternatively, users can provide their own values of lambda.
* Recognizes and exploits sparse input matrices (à la Matrix package; this feature is not yet implemented for the Cox family).
* Coefficient matrices are output in sparse matrix representation.
* Penalty is (1-a)*||beta||_2^2 + a*||beta||_1, where a is between 0 and 1; a=1 is the lasso penalty, a=0 is the ridge penalty.
  For many correlated predictors, a=.95 or thereabouts improves the performance of the lasso.
* Convenient predict, plot, print, and coef methods
* Variable-wise penalty modulation allows each variable to be penalized by a scalable amount; if the amount is zero, that variable
  is unpenalized and always enters the model.
* Some variables can be excluded (a convenience option)
* Glmnet uses a symmetric parametrization for multinomial, with constraints enforced by the penalization. 
  When the "grouped" option is used, it selects in or out all the class coefficients for a variable together.
* A comprehensive set of cross-validation routines is provided for all models and several error measures. These include deviance,
  mean absolute error, misclassification error and "auc" for logistic or multinomial models.
* Offsets and weights can be provided for all models
* Upper and lower bounds can be imposed on each of the coefficients
* An intercept option allows for models to be fit with or without intercepts.
* A standardize option allows for variable standardization
* A number of control parameters can be set in the calling function. In addition, a function glmnet.control allows users to set 
  some internal control variables for the entire session.
* Uses strong rules for speeding up convergence (by temporarily limiting the active set).
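
To make the first feature above concrete, here is a small sketch of how a sequence of 100 lambda values, uniformly spaced on the log scale, could be built. It is an illustration in plain Python, not glmnet's code. The function names, the 1/(2N) loss scaling behind lambda_max, and the min_ratio default of 1e-4 are all assumptions of this sketch.

```python
import math


def lambda_max(X, y, a):
    """Smallest lambda at which every coefficient is zero, for the objective
    (1/(2N)) * ||y - X beta||^2 + lam * ((1-a)*||beta||_2^2 + a*||beta||_1).
    Illustrative helper; requires a > 0 (pure ridge never zeroes coefficients).
    """
    if a <= 0:
        raise ValueError("lambda_max is only finite for a > 0")
    n, p = len(X), len(X[0])
    # Largest absolute inner product between a column and the response.
    biggest = max(abs(sum(X[i][j] * y[i] for i in range(n))) for j in range(p))
    return biggest / (n * a)


def log_spaced_lambdas(lam_max, n_lambda=100, min_ratio=1e-4):
    """Decreasing sequence from lam_max down to lam_max * min_ratio,
    equally spaced on the log scale (min_ratio is an assumed default)."""
    if n_lambda < 2:
        raise ValueError("need at least two lambda values")
    hi = math.log(lam_max)
    lo = math.log(lam_max * min_ratio)
    step = (hi - lo) / (n_lambda - 1)
    return [math.exp(hi - k * step) for k in range(n_lambda)]
```

Equal spacing on the log scale means each value is a constant fraction of the one before it, so the grid is dense where lambda is small and the model is changing fastest.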

Examples of glmnet speed trials:
Newsgroup data: N=11,000, p=0.75 million, two-class logistic, 100 values along lasso path. Time = 2 mins
14-class cancer data: N=144, p=16K, 14-class multinomial, 100 values along lasso path. Time = 30 secs

Authors: Jerome Friedman, Trevor Hastie, Rob Tibshirani and Noah Simon

References:
     Friedman, J., Hastie, T. and Tibshirani, R. (2008) Regularization
     Paths for Generalized Linear Models via Coordinate Descent
     http://www.stanford.edu/~hastie/Papers/glmnet.pdf
     Journal of Statistical Software, Vol. 33(1), 1-22 Feb 2010
     http://www.jstatsoft.org/v33/i01/

     Simon, N., Friedman, J., Hastie, T., Tibshirani, R. (2011)
     Regularization Paths for Cox's Proportional Hazards Model via
     Coordinate Descent, Journal of Statistical Software, Vol. 39(5)
     1-13  http://www.jstatsoft.org/v39/i05/

     Tibshirani, Robert, Bien, J., Friedman, J., Hastie, T., Simon,
     N., Taylor, J. and Tibshirani, Ryan (2010) Strong Rules for
     Discarding Predictors in Lasso-type Problems,
     http://www-stat.stanford.edu/~tibs/ftp/strong.pdf

 ----------------------------------------------------------------------------------------
  Trevor Hastie                                   hastie at stanford.edu  
  Professor, Department of Statistics, Stanford University
  Phone: (650) 725-2231                 Fax: (650) 725-8977  
  URL: http://www.stanford.edu/~hastie  
   address: room 104, Department of Statistics, Sequoia Hall
           390 Serra Mall, Stanford University, CA 94305-4065  
 --------------------------------------------------------------------------------------



