[Rd] scale.default gives an incorrect error message when is.numeric() fails on a dgeMatrix

Martin Maechler maechler at stat.math.ethz.ch
Thu Mar 1 18:52:46 CET 2018


>>>>> Michael Chirico <michaelchirico4 at gmail.com>
>>>>>     on Tue, 27 Feb 2018 20:18:34 +0800 writes:

Slightly amended 'Subject': (unimportant mistake: a dgeMatrix is *not* sparse)

MM: modified to commented R code,  slightly changed from your post:


## I am attempting to use the lars package with a sparse input feature matrix,
## but the following fails:

library(Matrix)
library(lars)
data(diabetes) # from 'lars'
##UAagghh! not like this -- both attach() *and*   as.data.frame()  are horrific!
##UA  attach(diabetes)
##UA  x = as(as.matrix(as.data.frame(x)), 'dgCMatrix')
x <- as(unclass(diabetes$x), "dgCMatrix")
lars(x, diabetes$y, intercept = FALSE)
## Error in scale.default(x, FALSE, normx) :
##   length of 'scale' must equal the number of columns of 'x'

## More specifically, scale.default fails as called from lars():
normx <- new("dgeMatrix",
  x = c(4, 0, 9, 1, 1, -1, 4, -2, 6, 6)*1e-14, Dim = c(1L, 10L),
  Dimnames = list(NULL,
                  c("x.age", "x.sex", "x.bmi", "x.map", "x.tc",
                    "x.ldl", "x.hdl", "x.tch", "x.ltg", "x.glu")))
scale.default(x, center=FALSE, scale = normx)
## Error in scale.default(x, center = FALSE, scale = normx) :
##   length of 'scale' must equal the number of columns of 'x'

>  The problem is that this check fails because is.numeric(normx) is FALSE:

>  if (is.numeric(scale) && length(scale) == nc)

>  So, the error message is misleading. In fact length(scale) is the same as
>  nc.

Correct, twice.
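
Both points can be checked directly.  The following is only a minimal
sketch: 's' is a made-up 1 x 3 dgeMatrix standing in for 'normx', and
'nc' plays the role of ncol(x) inside scale.default().

```r
library(Matrix)

## Hypothetical 1 x 3 dense Matrix, standing in for 'normx'
s  <- new("dgeMatrix", x = c(1, 2, 3), Dim = c(1L, 3L))
nc <- 3L

is.numeric(s)              # FALSE: an S4 object, not a base numeric vector
length(s) == nc            # TRUE: so the length is not what fails
is.numeric(as.numeric(s))  # TRUE after explicit coercion
```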

>  At a minimum, the error message needs to be repaired; do we also want to
>  attempt as.numeric(normx) (which I believe would have allowed scale to work
>  in this case)?

It seems sensible to require only that 'center' and 'scale' can be
*coerced* by as.numeric(.), rather than that they fulfill is.numeric(.).

Though that is not a bug in scale()  as its help page has always
said that 'center' and 'scale' should either be a logical value
or a numeric vector.

For that reason I would rather call this a bug in 'lars', which
should not use

       scale(x, FALSE, normx)

but rather

       scale(x, FALSE, scale = as.numeric(normx))

and then all would work.
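
A small illustration of that explicit coercion, as a sketch only: the
matrix 'x' and the values in 'normx' below are invented for the example
and are not the diabetes data.

```r
library(Matrix)

## Small dense stand-in for the design matrix (hypothetical values)
x <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3,
            dimnames = list(NULL, c("a", "b")))

## A 1 x 2 dgeMatrix of column norms, analogous to 'normx' in lars()
normx <- new("dgeMatrix", x = c(2, 4), Dim = c(1L, 2L))

## Explicit coercion hands scale() the plain numeric vector
## that its help page asks for
xs <- scale(x, center = FALSE, scale = as.numeric(normx))
xs[, "a"]   # column "a" divided by 2
xs[, "b"]   # column "b" divided by 4
```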

> -----------------

>  (I'm aware that there's some import issues in lars, as the offending line
>  to create normx *should* work, as is.numeric(sqrt(drop(rep(1, nrow(x)) %*%
>  (x^2)))) is TRUE -- it's simply that lars doesn't import the appropriate S4
>  methods)

>  Michael Chirico

Yes, 'lars' has _not_ been updated since  Spring 2013, notably
because its authors have been saying (for rather more than 5
years I think) that one should really use 

 require("glmnet")

instead.

Your point is still valid that it would be easy to enhance
base::scale.default() so it'd work in more cases.

Thank you for that.  I do plan to consider such a change in
R-devel (planned to become R 3.5.0 in April).
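
One way such an enhancement could look, purely as a sketch: the helper
name coerce_scale_arg() is hypothetical, and this is not the actual R
source nor necessarily what will end up in R-devel.  It tries
as.numeric(.) when is.numeric(.) fails, and reports a coercion failure
separately from a length mismatch, so the two cases no longer share one
error message.

```r
## Hypothetical helper (not part of base R): turn a 'scale' or 'center'
## argument into a plain numeric vector and check its length against nc.
coerce_scale_arg <- function(arg, nc, what = "scale") {
  if (!is.numeric(arg)) {
    ## Objects that are not numeric may still have an as.numeric() method
    arg <- tryCatch(
      as.numeric(arg),
      error = function(e)
        stop(sprintf("'%s' is not coercible to numeric", what), call. = FALSE)
    )
  }
  if (length(arg) != nc)
    stop(sprintf("length of '%s' must equal the number of columns of 'x'", what),
         call. = FALSE)
  arg
}

coerce_scale_arg(c(2, 4), nc = 2L)   # passes through unchanged
```

With Matrix attached, a dgeMatrix such as 'normx' should take the
as.numeric(.) branch and come back as an ordinary numeric vector.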

Martin Maechler,
ETH Zurich


