[Rd] scale.default gives an incorrect error message when is.numeric() fails on a dgeMatrix

Michael Chirico michaelchirico4 at gmail.com
Fri Mar 2 01:27:07 CET 2018


thanks. I know the setup code is a mess, just duct-taped something together
from the examples in lars (which are a mess in turn). in fact when I
messaged Prof. Hastie he recommended using glmnet. I wonder why lars is
kept on CRAN if they've no intention of maintaining it... but I digress...

On Mar 2, 2018 1:52 AM, "Martin Maechler" <maechler at stat.math.ethz.ch>
wrote:

> >>>>> Michael Chirico <michaelchirico4 at gmail.com>
> >>>>>     on Tue, 27 Feb 2018 20:18:34 +0800 writes:
>
> Slightly amended 'Subject': (unimportant mistake: a dgeMatrix is *not*
> sparse)
>
> MM: modified to commented R code,  slightly changed from your post:
>
>
> ## I am attempting to use the lars package with a sparse input feature
> matrix,
> ## but the following fails:
>
> library(Matrix)
> library(lars)
> data(diabetes) # from 'lars'
> ##UAagghh! not like this -- both attach() *and*   as.data.frame()  are
> horrific!
> ##UA  attach(diabetes)
> ##UA  x = as(as.matrix(as.data.frame(x)), 'dgCMatrix')
> x <- as(unclass(diabetes$x), "dgCMatrix")
> lars(x, y, intercept = FALSE)
> ## Error in scale.default(x, FALSE, normx) :
> ##   length of 'scale' must equal the number of columns of 'x'
>
> ## More specifically, scale.default fails as called from lars():
> normx <- new("dgeMatrix",
>   x = c(4, 0, 9, 1, 1, -1, 4, -2, 6, 6)*1e-14, Dim = c(1L, 10L),
>   Dimnames = list(NULL,
>                   c("x.age", "x.sex", "x.bmi", "x.map", "x.tc",
>                     "x.ldl", "x.hdl", "x.tch", "x.ltg", "x.glu")))
> scale.default(x, center=FALSE, scale = normx)
> ## Error in scale.default(x, center = FALSE, scale = normx) :
> ##   length of 'scale' must equal the number of columns of 'x'
>
> >  The problem is that this check fails because is.numeric(normx) is FALSE:
>
> >  if (is.numeric(scale) && length(scale) == nc)
>
> >  So, the error message is misleading. In fact length(scale) is the same
> as
> >  nc.
>
> Correct, twice.
>
> >  At a minimum, the error message needs to be repaired; do we also want to
> >  attempt as.numeric(normx) (which I believe would have allowed scale to
> work
> >  in this case)?
>
> It seems sensible to allow  both 'center' and 'scale' to only
> have to *obey*  as.numeric(.)  rather than fulfill is.numeric(.).
>
> Though that is not a bug in scale()  as its help page has always
> said that 'center' and 'scale' should either be a logical value
> or a numeric vector.
>
> For that reason I can really claim a bug in 'lars' which should
> really not use
>
>        scale(x, FALSE, normx)
>
> but rather
>
>        scale(x, FALSE, scale = as.numeric(normx))
>
> and then all would work.
>
> > -----------------
>
> >  (I'm aware that there's some import issues in lars, as the offending
> line
> >  to create normx *should* work, as is.numeric(sqrt(drop(rep(1, nrow(x))
> %*%
> >  (x^2)))) is TRUE -- it's simply that lars doesn't import the
> appropriate S4
> >  methods)
>
> >  Michael Chirico
>
> Yes, 'lars' has _not_ been updated since  Spring 2013, notably
> because its authors have been saying (for rather more than 5
> years I think) that one should really use
>
>  require("glmnet")
>
> instead.
>
> Your point is still valid that it would be easy to enhance
> base :: scale.default()  so it'd work in more cases.
>
> Thank you for that.  I do plan to consider such a change in
> R-devel (planned to become R 3.5.0 in April).
>
> Martin Maechler,
> ETH Zurich
>
>
>

	[[alternative HTML version deleted]]



More information about the R-devel mailing list