[R] Reshaping matrix of vectors as dataframe

William Dunlap wdunlap at tibco.com
Sun Jan 31 19:47:43 CET 2010


> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Oliver Gondring
> Sent: Sunday, January 31, 2010 6:53 AM
> To: r-help at r-project.org
> Subject: [R] Reshaping matrix of vectors as dataframe
> 
> Dear R people,
> 
> I have to deal with the output of a function which comes as a matrix
> of vectors.
> You can reproduce the structure as given below:
> 
> x <- list(c(1,2,4),c(1,3,5),c(0,1,0),
>            c(1,3,6,5),c(3,4,4,4),c(0,1,0,1),
>            c(3,7),c(1,2),c(0,1))
> data <- matrix(x,byrow=TRUE,nrow=3)
> colnames(data) <- c("First", "Length", "Value")
> rownames(data) <- c("Case1", "Case2", "Case3")
> 
>  > data
>        First     Length    Value
> Case1 Numeric,3 Numeric,3 Numeric,3
> Case2 Numeric,4 Numeric,4 Numeric,4
> Case3 Numeric,2 Numeric,2 Numeric,2
> 
>  > data["Case1",]
> $First
> [1] 1 2 4
> 
> $Length
> [1] 1 3 5
> 
> $Value
> [1] 0 1 0
> --------------------
> 
> My goal now is to break the three vectors of each row of the matrix
> into their elements, assigning each element to a certain "Sequence"
> (which I want to be numbered according to the position of the
> corresponding values within the vectors), reshaping the whole as a
> data frame like this:
> 
> Case   Sequence  First  Length  Value
>
> Case1  1         1      1       0
> Case1  2         2      3       1
> Case1  3         4      5       0
>
> Case2  1         1      3       0
> Case2  2         3      4       1
> Case2  3         6      4       0
> Case2  4         5      4       1
>
> Case3  1         3      1       0
> Case3  2         7      2       1

The following is not terribly elegant, but is
pretty easy to understand.
  > lengths<-sapply(data[,1],length)
  > data.frame(Case=rep(rownames(data),lengths),
  +            Sequence=sequence(lengths),
  +            apply(data,2,unlist),
  +            row.names=NULL)
     Case Sequence First Length Value
  1 Case1        1     1      1     0
  2 Case1        2     2      3     1
  3 Case1        3     4      5     0
  4 Case2        1     1      3     0
  5 Case2        2     3      4     1
  6 Case2        3     6      4     0
  7 Case2        4     5      4     1
  8 Case3        1     3      1     0
  9 Case3        2     7      2     1
It assumes that sapply(data[,k],length) is the
same for all k in 1:ncol(data).   If you do this
often, put it into a function that makes that
check.
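
A minimal sketch of such a function, assuming the input is a
list-matrix with row and column names like the 'data' above.  The
name flatten_cases() and the wording of the error message are made
up for illustration; they are not from any package.

```r
# Hypothetical helper: flatten a matrix of vectors into a long data
# frame, checking the equal-length assumption before doing any work.
flatten_cases <- function(m) {
  # Length of the vector in every cell, kept in the shape of the matrix.
  len <- matrix(vapply(m, length, integer(1)), nrow = nrow(m))
  # len[, 1] is recycled down each column, so this compares every
  # column's lengths with those of the first column, row by row.
  if (any(len != len[, 1])) {
    stop("vectors within a row must all have the same length")
  }
  n <- len[, 1]
  # Same expression as in the interactive version above.
  data.frame(Case = rep(rownames(m), n),
             Sequence = sequence(n),
             apply(m, 2, unlist),
             row.names = NULL)
}
```

With the 'data' matrix from the question, flatten_cases(data) should
give the same data frame as the interactive call above, and a matrix
whose rows mix vector lengths stops with an error instead of
silently producing misaligned columns.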

It uses the much-maligned (by me) apply() function, so
it wastes effort "simplifying" the results of unlist()
into the columns of a matrix that data.frame() will
immediately pull apart into columns again.  The following
avoids apply() but is wordier:
  > data.frame(Case=rep(rownames(data),lengths),
  +            Sequence=sequence(lengths),
  +            lapply(split(data,colnames(data)[col(data)]), unlist),
  +            row.names=NULL)
     Case Sequence First Length Value
  1 Case1        1     1      1     0
  2 Case1        2     2      3     1
  3 Case1        3     4      5     0
  4 Case2        1     1      3     0
  5 Case2        2     3      4     1
  6 Case2        3     6      4     0
  7 Case2        4     5      4     1
  8 Case3        1     3      1     0
  9 Case3        2     7      2     1
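
One detail to keep in mind with the split() version: split()
returns its groups in the order of the factor levels, which for a
character grouping vector is sorted order.  The columns come out as
First, Length, Value here only because those names happen to be
sorted already.  A sketch of a variant that keeps the matrix's own
column order by looping over the columns directly (the names 'n',
'cols' and 'result' are only for illustration):

```r
# Rebuild the example matrix of vectors from the question.
x <- list(c(1,2,4), c(1,3,5), c(0,1,0),
          c(1,3,6,5), c(3,4,4,4), c(0,1,0,1),
          c(3,7), c(1,2), c(0,1))
data <- matrix(x, byrow = TRUE, nrow = 3,
               dimnames = list(c("Case1", "Case2", "Case3"),
                               c("First", "Length", "Value")))

# Number of elements per case, taken from the first column.
n <- sapply(data[, 1], length)

# One unlist() per column, in the order the columns appear in 'data'.
cols <- lapply(seq_len(ncol(data)),
               function(j) unlist(data[, j], use.names = FALSE))
names(cols) <- colnames(data)

result <- data.frame(Case = rep(rownames(data), n),
                     Sequence = sequence(n),
                     cols,
                     row.names = NULL)
```

Like the split() version this avoids apply(), and it still assumes
that the vectors within each row all have the same length.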

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> 
> I suspect that there might be an elegant and not too complicated way
> to do this with one or several of the functions provided by the
> 'reshape' package, but due to my lack of experience with R in
> general, this package in particular and the complexity of the task I
> wasn't able to figure out how to do it so far.
> 
> Every hint or helpful comment is much appreciated!
> 
> Oliver


