[R] folding table into a matrix

Peter Dalgaard p.dalgaard at biostat.ku.dk
Thu Sep 23 22:38:31 CEST 2004


Gene Cutler <gcutler at amgen.com> writes:

> I'm just getting started with R, so feel free to point me to the
> appropriate documentation if this is already answered somewhere
> (though I've been unable to find it myself).  This does seem like a
> rather basic question.
> I want to fold a table into a matrix.  The table is formatted like so:
> 
> Column_Index  Value
> 1             486
> 2             688
> 3             447
> 4             555
> 5             639
> 1             950
> 2             881
> 3             1785
> 4             1216
> 1             612
> 2             790
> 3             542
> 4             1310
> 5             976
> 
> And I want to end up with something like this:
> 
>        [,1]  [,2]  [,3]  [,4]  [,5]
> [1,]   486   688   447   555   639
> [2,]   950   881  1785  1216    NA
> [3,]   612   790   542  1310   976
> 
> Since not all the rows are complete, I can't just reformat using
> matrix(), I need to go by the index information in the Column_Index
> column.  This seems like something simple to do, but I'm stumped.

It's not completely trivial, since you're relying on ordering
information: the missing col.5 value goes in the 2nd row, but you only
know that because values are ordered in row blocks.

If you supply the rows that things belong in, the task does become
simple:

Row_Index <- rep(1:3,c(5,4,5))
M <- matrix(NA, 3, 5)
M[cbind(Row_Index,Column_Index)] <- Value
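For anyone wanting to paste this in directly, here is a self-contained
sketch of the same three lines. The data vectors are typed in by hand
from the table in the question; in practice they would presumably come
from the columns of a data frame read with read.table(), which is an
assumption on my part.

```r
# Data from the question, entered by hand
Column_Index <- c(1:5, 1:4, 1:5)
Value <- c(486, 688, 447, 555, 639,
           950, 881, 1785, 1216,
           612, 790, 542, 1310, 976)

# Row membership supplied explicitly: 5, 4 and 5 values per row
Row_Index <- rep(1:3, c(5, 4, 5))

# Start from an all-NA matrix so that unfilled cells stay missing
M <- matrix(NA, 3, 5)

# Indexing with a two-column matrix addresses one (row, column) cell
# per row of the index matrix
M[cbind(Row_Index, Column_Index)] <- Value
M
```

The cell M[2, 5] is never assigned, so it remains NA.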

Now how to compute Row_Index from Column_Index? If you know that each
group starts with a "1", you might use (rename Column_Index as cc for
brevity) 

> cumsum(cc==1)
 [1] 1 1 1 1 1 2 2 2 2 3 3 3 3 3

If you can't make that assumption, but the column index still increases
within each row, you can start a new row wherever the index drops:

> cumsum(c(1,diff(cc)<0))
 [1] 1 1 1 1 1 2 2 2 2 3 3 3 3 3
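Putting the pieces together, a minimal sketch might look like the
following. The function name fold_table and its argument names are made
up for illustration, and the matrix dimensions are taken from the
largest row and column index seen, which is an assumption rather than
anything stated in the question.

```r
# Hypothetical helper: fold (column index, value) pairs into a matrix,
# starting a new row wherever the column index decreases
fold_table <- function(cc, value) {
  rr <- cumsum(c(1, diff(cc) < 0))      # row index derived from ordering
  M <- matrix(NA, max(rr), max(cc))     # NA wherever no value is supplied
  M[cbind(rr, cc)] <- value             # matrix indexing, one cell per pair
  M
}

cc <- c(1:5, 1:4, 1:5)
Value <- c(486, 688, 447, 555, 639,
           950, 881, 1785, 1216,
           612, 790, 542, 1310, 976)
M2 <- fold_table(cc, Value)
M2
```

Note that diff(cc) < 0 will not split two consecutive values that share
the same column index (say, two one-element rows that both hold only
column 1); diff(cc) <= 0 would catch that case as well.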


-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907



