[R] merging

Sundar Dorai-Raj sundar.dorai-raj at pdf.com
Tue May 30 22:09:27 CEST 2006



Gavin Simpson wrote:
> Dear List,
> 
> Given,
> 
> y <- matrix(c(0,1,1,1,0,0,0,4,4), ncol = 3, byrow = TRUE)
> rownames(y) <- c("a","b","c")
> colnames(y) <- c("1","2","3")
> y
> y2 <- y[2:3, ]
> rownames(y2) <- c("x","z")
> y2
> 
> how can I stop
> 
> merge(y, y2, all = TRUE, sort = FALSE)
> 
> squishing the extra rows? Ideally I want the same as:
> 
> rbind(y, y2)
> 
> in this case. This is a specific example of the situation where two data
> matrices have the same column variables and all I want is to stick the two
> sets of rows together, but I have been using merge for cases such as the
> one below, where the second matrix has extra column(s):
> 
> y3 <- matrix(c(0,1,1,1,0,0,0,4,4,5,6,7), ncol = 4, byrow = TRUE)
> rownames(y3) <- c("d","e","f")
> colnames(y3) <- c("1","2","3","4")
> y3
> merge(y, y3, all = TRUE, sort = FALSE)
> 
> We don't know beforehand if the columns will match. But I see now that
> even this doesn't work as I was expecting/thinking!
> 
> So I'm looking for a general way to merge two matrices such that the
> number of rows in the merged matrix is nrow(mat1) + nrow(mat2) and the
> number of columns in the merged matrix is
> length(unique(c(colnames(mat1), colnames(mat2)))).
> 
> Is there a function in R to do this, or can someone suggest a way to
> achieve this? My R version info is at the end.
> 
> Just to be clear, for the y, y3 example I want something like this
> returned:
> 
>   1 2 3 4
> a 0 1 1 NA
> b 1 0 0 NA
> c 0 4 4 NA
> d 0 1 1 1
> e 0 0 0 4
> f 4 5 6 7
> 
> and for the y, y2 example, I want something like this returned:
> 
>   1 2 3
> a 0 1 1
> b 1 0 0
> c 0 4 4
> x 1 0 0
> z 0 4 4
> 
> Many thanks,
> 
> Gav
> 
> 
>>version
> 
>                _
> platform       i686-pc-linux-gnu
> arch           i686
> os             linux-gnu
> system         i686, linux-gnu
> status         Patched
> major          2
> minor          3.0
> year           2006
> month          05
> day            03
> svn rev        37978
> language       R
> version.string Version 2.3.0 Patched (2006-05-03 r37978)


Will this help?

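rbind.all <- function(...) {
   x <- list(...)
   # union of all column names, in order of first appearance
   cn <- unique(unlist(lapply(x, colnames)))
   for(i in seq(along = x)) {
     # pad any columns this argument lacks with NA
     if(any(m <- !cn %in% colnames(x[[i]]))) {
       na <- matrix(NA, nrow(x[[i]]), sum(m))
       dimnames(na) <- list(rownames(x[[i]]), cn[m])
       x[[i]] <- cbind(x[[i]], na)
     }
     # put columns in a common order; rbind on matrices binds by position
     x[[i]] <- x[[i]][, cn, drop = FALSE]
   }
   do.call(rbind, x)
}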

y <- matrix(c(0,1,1,1,0,0,0,4,4), ncol = 3, byrow = TRUE)
rownames(y) <- c("a","b","c")
colnames(y) <- c("1","2","3")
y2 <- y[2:3, 2:3]
rownames(y2) <- c("x","z")
y3 <- matrix(c(0,1,1,1,0,0,0,4,4,5,6,7), ncol = 4, byrow = TRUE)
rownames(y3) <- c("d","e","f")
colnames(y3) <- c("1","2","3","4")

rbind.all(y, y2, as.data.frame(y3))

It does very little error-checking, so be careful how you use it.
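
If every input is a plain matrix, another option is to allocate the result up front and fill it in by column name, which avoids relying on how rbind lines columns up. What follows is only a minimal sketch: the name rbind_by_name is made up for illustration, and it assumes every argument is a matrix with both row names and column names set.

```r
# Hypothetical helper (name is an assumption, not part of base R):
# stack matrices row-wise, matching columns by name and filling gaps with NA.
rbind_by_name <- function(...) {
  mats <- list(...)
  # union of column names, in order of first appearance
  cn <- unique(unlist(lapply(mats, colnames)))
  # row names of all inputs, in the order the rows will be stacked
  rn <- unlist(lapply(mats, rownames))
  out <- matrix(NA, nrow = length(rn), ncol = length(cn),
                dimnames = list(rn, cn))
  row <- 0
  for (m in mats) {
    idx <- row + seq_len(nrow(m))
    # assign by column name so that column order in m does not matter
    out[idx, colnames(m)] <- m
    row <- row + nrow(m)
  }
  out
}

y <- matrix(c(0,1,1,1,0,0,0,4,4), ncol = 3, byrow = TRUE,
            dimnames = list(c("a","b","c"), c("1","2","3")))
y2 <- y[2:3, 2:3]
rownames(y2) <- c("x","z")
y3 <- matrix(c(0,1,1,1,0,0,0,4,4,5,6,7), ncol = 4, byrow = TRUE,
             dimnames = list(c("d","e","f"), c("1","2","3","4")))

res <- rbind_by_name(y, y2, y3)
```

The result has nrow(y) + nrow(y2) + nrow(y3) rows and one column per distinct column name, with NA wherever an input lacked that column.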

--sundar


