[Rd] Notes on bug reports 3229 and 3242 - as.matrix.data.frame

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Feb 11 16:47:49 CET 2005


You too have not given a reproducible example!

If you have a corrupt data frame, the function may fail, which is what 
happened in the PR# you quote.

Please note: you should not be calling as.matrix.data.frame, but as.matrix.
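
For illustration only, a minimal sketch of calling the generic, using the 
small data frame given at the end of the quoted message.  R selects the 
data frame method itself:

```r
# Example data frame taken from the end of the quoted message
tmp1 <- data.frame(level = c("A A", "B B"), x = c(NA, -5.8))

# Call the generic; R dispatches to the data frame method
m <- as.matrix(tmp1)

# One column is non-numeric, so the result is a character matrix
dim(m)
is.character(m)
```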

On Fri, 11 Feb 2005, Gorjanc Gregor wrote:

> Hello R developers.
>
> I encountered the same problem as Uwe Ligges with as.matrix.data.frame()
> in bug reports 3229 and 3242 - under section not-reproducible.
>
> Example I have is:
>
>> tmp
>                             level 2100-D
> 1       biological_process unknown     NA
> 2                 cellular process  -5.88
> 3                      development  -8.42
> 4            physiological process  -6.55
> 5 regulation of biological process     NA
> 6                 viral life cycle     NA
>
>> str(tmp)
> `data.frame':   6 obs. of  2 variables:
> $ level      : Factor w/ 6 levels "biological_..",..: 1 2 3 4 5 6
> $ 2100-D_mean:`data.frame':    6 obs. of  1 variable:
>  ..$ 2100-D: num  NA -5.88 -8.42 -6.55 NA NA

I think you have a data frame column in a data frame, and that cannot be 
made directly into a matrix.  It's the steps that got you here that are 
the problem.
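
A sketch of how such an object can arise, and one way to undo it.  The 
original 'tmp' was not posted, so 'nested' and 'inner' below are 
hypothetical stand-ins, and assigning a data frame as a column is only one 
of the possible routes to this structure:

```r
# Hypothetical stand-in for the unposted 'tmp'
nested <- data.frame(level = c("a", "b", "c"))
inner  <- data.frame(x = c(NA, -5.88, -8.42))

# Assigning a whole data frame to one column stores it as a nested column
nested$x_mean <- inner

# Detect columns that are themselves data frames
is_df <- sapply(nested, is.data.frame)

# Replace the nested column by the vector it contains
flat <- nested
flat$x_mean <- inner[[1]]

# With only plain vector columns left, conversion is straightforward
m_flat <- as.matrix(flat)
dim(m_flat)
```

Checking with sapply(x, is.data.frame) before converting is a cheap way to 
catch this kind of structure early.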

>> as.matrix.data.frame(tmp)
> Error in as.matrix.data.frame(tmp) : dim<- : dims [product 6] do not
> match the length of object [7]
>
> The error associated with this is coming up at the end of function
> as.matrix.data.frame where it is used:
>
>    dim(X) <- c(n, length(X)/n)
>
> ?dim says
>     'dim' has a method for 'data.frame's, which returns the length of
>     the 'row.names' attribute of 'x' and the length of 'x' (the
>     numbers of "rows" and "columns").
>
> This part is ok. The problem is with X, which is "intensively"
> modified through the function. Before this (dim(X) <- ...) call
> X in my case is:
>
>> x <- tmp
>> "code from as.matrix.data.frame down to dim(X) <- ..."
>> X
> [[1]]
> [1] "biological_process unknown"
>
> [[2]]
> [1] "cellular process"
>
> [[3]]
> [1] "development"
>
> [[4]]
> [1] "physiological process"
>
> [[5]]
> [1] "regulation of biological process"
>
> [[6]]
> [1] "viral life cycle"
>
> [[7]]
> [1]    NA -5.88 -8.42 -6.55    NA    NA
>
> So we can see, that X is somehow destroyed - the first and second
> column of tmp differ. For dim command this should really be one
> long vector. So the problem lies in line
>
>    X <- unlist(X, recursive = FALSE, use.names = FALSE)
>
> where it should be
>
>    X <- unlist(X, recursive = TRUE, use.names = FALSE)
>                               ^^^^
>
> I have checked the source code for that function in R as well as
> in the R-devel sources. I was not successful in reproducing the above
> with the data frame below though. It did not report any problems
> with the old as.matrix.data.frame. There must be some trick with
> the first column in my data. So I am quite sure my suggestion is
> OK.
>
> tmp1 <- data.frame(level=c("A A", "B B"), x=c(NA, -5.8))
>
> --
> Lep pozdrav / With regards,
>    Gregor GORJANC
>
> ---------------------------------------------------------------
> University of Ljubljana
> Biotechnical Faculty       URI: http://www.bfro.uni-lj.si
> Zootechnical Department    email: gregor.gorjanc <at> bfro.uni-lj.si
> Groblje 3                  tel: +386 (0)1 72 17 861
> SI-1230 Domzale            fax: +386 (0)1 72 17 888
> Slovenia
>
> ______________________________________________
> R-devel at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
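
The length mismatch in the quoted error (7 against 6) can be seen on a 
small list.  This is a sketch with a hypothetical stand-in 'X' (three labels 
rather than six), not the actual object built inside as.matrix.data.frame:

```r
# Hypothetical stand-in: character labels plus a list holding one numeric vector
X <- list(c("a", "b", "c"), list(c(NA, -5.88, -8.42)))

# Flattening one level keeps the numeric vector as a single list element
n_shallow <- length(unlist(X, recursive = FALSE, use.names = FALSE))

# Flattening fully yields one long atomic vector
n_deep <- length(unlist(X, recursive = TRUE, use.names = FALSE))

n_shallow  # 4: three labels plus one element holding the whole vector
n_deep     # 6: three labels plus three numbers
```

This only shows where the extra element comes from.  Whether changing the 
argument is appropriate is a separate question, since the input here is a 
data frame with a data frame column.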

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


