[Rd] possible bug in model.matrix

Peter Dalgaard p.dalgaard at biostat.ku.dk
Tue Sep 13 22:37:10 CEST 2005


"Whit Armstrong" <whit at twinfieldscapital.com> writes:

> Is this a bug, or have I misunderstood the proper use of lm?

Dunno. It appears that logicals, like factors, are not supposed to have
matrix structure. What actually happens is that setting contrasts
strips the dimension attributes:
 
Browse[1]>
debug: for (nn in namD[isF]) if (is.null(attr(data[[nn]], "contrasts"))) contrasts(data[[nn]]) <- contr.funs[1 +
    isOF[nn]]
Browse[1]> zz <- data[["y"]]
Browse[1]> contrasts(zz) <-  contrasts(zz)
Browse[1]> zz
  [1] TRUE  TRUE  TRUE  FALSE TRUE  TRUE  TRUE  TRUE  TRUE  FALSE TRUE FALSE
 [13] FALSE TRUE  TRUE  TRUE  FALSE TRUE  FALSE TRUE  TRUE  FALSE FALSE TRUE
 [25] TRUE  TRUE  FALSE FALSE FALSE FALSE TRUE  FALSE FALSE TRUE  TRUE FALSE
 [37] FALSE FALSE FALSE TRUE  TRUE  TRUE  TRUE  FALSE FALSE FALSE TRUE
..
 [85] TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  FALSE TRUE  TRUE  TRUE  TRUE
 [97] TRUE  TRUE  TRUE  FALSE
Levels: FALSE TRUE
 
which in turn comes from

if (is.logical(x)) x <- factor(x, levels = c(FALSE, TRUE))

and the fact that factor() throws away dimensions.
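
To see the dimension loss in isolation, here is a minimal sketch using a
small made-up logical matrix (not the data from the report). The variable
names are only illustrative:

```r
# A small logical matrix with explicit dimensions (3 rows, 2 columns).
y <- matrix(c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE), ncol = 2)
dim(y)

# The same conversion that is applied to logicals when contrasts are set.
f <- factor(y, levels = c(FALSE, TRUE))

# The result is a plain factor of length 6; the dim attribute is gone.
dim(f)
length(f)
levels(f)
```

So a 50 x 2 logical matrix becomes a factor of length 100, which matches
the "replacement has 100 rows, data has 50" error.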

*If* it's a bug, I don't think it is easily fixable....
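
In the meantime, one possible workaround (my assumption, and only sensible
if a 0/1 coding of the columns is acceptable) is to convert the logical
matrix to numeric before calling lm, since the numeric case is reported to
work. A sketch, with y_num as a made-up name:

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
x <- rnorm(50)
y <- matrix(as.logical(round(runif(100), 0)), ncol = 2)

# Arithmetic coerces logical to numeric (FALSE -> 0, TRUE -> 1)
# and keeps the dim attribute, so y_num is still a 50 x 2 matrix.
y_num <- y + 0

fit <- lm(x ~ y_num)              # one coefficient per column plus intercept
coef(fit)
```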

> Thanks,
> Whit
> 
> 
> code:
> x <- rnorm(50)
> y <- matrix(as.logical(round(runif(100),0)),ncol=2)
> NROW(x)==NROW(y)
> lm(x~y)
> 
> 
> 
> > x <- rnorm(50)
> > y <- matrix(as.logical(round(runif(100),0)),ncol=2)
> > NROW(x)==NROW(y)
> [1] TRUE
> > lm(x~y)
> Error in "[[<-.data.frame"(`*tmp*`, nn, value = c(2, 1, 2, 1, 1, 1, 2, :
>         replacement has 100 rows, data has 50
> >
> 
> 
> However, the call to lm works if the matrix is numeric instead of
> logical:
> x <- rnorm(50)
> y <- matrix(runif(100),ncol=2)
> NROW(x)==NROW(y)
> lm(x~y)
> 
> 
> Seems to be a problem in model.matrix.default:
> 
> debug: for (nn in namD[isF]) if (is.null(attr(data[[nn]], "contrasts"))) contrasts(data[[nn]]) <- contr.funs[1 +
>     isOF[nn]]
> Browse[1]>
> Error in "[[<-.data.frame"(`*tmp*`, nn, value = c(1, 2, 2, 2, 2, 2, 2, :
>         replacement has 100 rows, data has 50
> >
> 
> 
> > R.Version()
> $platform
> [1] "i686-pc-linux-gnu"
> 
> $arch
> [1] "i686"
> 
> $os
> [1] "linux-gnu"
> 
> $system
> [1] "i686, linux-gnu"
> 
> $status
> [1] "alpha"
> 
> $major
> [1] "2"
> 
> $minor
> [1] "2.0"
> 
> $year
> [1] "2005"
> 
> $month
> [1] "09"
> 
> $day
> [1] "12"
> 
> $"svn rev"
> [1] "35558"
> 
> $language
> [1] "R"
> 
> >
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907


