[R] reshaping some data

Sundar Dorai-Raj sundar.dorai-raj at PDF.COM
Tue Sep 14 18:15:46 CEST 2004


Hi all,
   I have a data.frame with the following colnames pattern:

x1 y11 x2 y21 y22 y23 x3 y31 y32 ...

That is, each x column is followed by one or more y columns. What I would 
like to do is turn this wide format into a tall format with two columns: 
"x" and "y". The structure is that xi needs to be associated with yij 
(e.g. x1 should be next to y11, x2 should be next to y21, y22, and y23, etc.).

  x   y
x1 y11
x2 y21
x2 y22
x2 y23
x3 y31
x3 y32
...
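
To make the intended pairing concrete, here is a minimal sketch that works 
on the column names alone. It assumes every y column belongs to the nearest 
x column to its left; the names nms, is_x, grp and pairs are made up for 
this illustration.

```r
# column names following the pattern above (without the trailing "...")
nms <- c("x1", "y11", "x2", "y21", "y22", "y23", "x3", "y31", "y32")
# TRUE where a name is an x column
is_x <- grepl("^x", nms)
# running count of x columns: each y inherits the index of the x before it
grp <- cumsum(is_x)
# look up the owning x name for every y name
pairs <- data.frame(x = nms[is_x][grp[!is_x]], y = nms[!is_x])
print(pairs)
```

The cumsum() step is only one way to express "the nearest x to the left"; 
it relies on the columns being in the order shown.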

I have looked at ?reshape but I didn't see how it could work with this 
structure. I have a solution using nested for loops (see below), but 
it's slow. I would like to find a vectorised solution that achieves the 
same thing.

Now, for an example:

x <- data.frame(x1 =  1: 5, y11 =  1: 5,
                 x2 =  6:10, y21 =  6:10, y22 = 11:15,
                 x3 = 11:15, y31 = 16:20,
                 x4 = 16:20, y41 = 21:25, y42 = 26:30, y43 = 31:35)
# which are the x columns
nmx <- grep("^x", names(x))
# which are the y columns
nmy <- grep("^y", names(x))
# grab y values
y <- unlist(x[nmy])
# reserve some space for the x's
z <- vector("numeric", length(y))
# a loop counter
k <- 0
n <- nrow(x)
seq.n <- seq(n)
# determine how many times to repeat the x's
repy <- diff(c(nmx, length(names(x)) + 1)) - 1
for(i in seq_along(nmx)) {
   # seq_len() gives an empty loop when an x column has no y columns
   for(j in seq_len(repy[i])) {
     # store the x values in the appropriate z indices
     z[seq.n + k * n] <- x[, nmx[i]]
     # move to next block in z
     k <- k + 1
   }
}
data.frame(x = z, y = y, row.names = NULL)
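
For comparison, one possible loop-free sketch of the same computation. It 
reuses the nmx, nmy and repy definitions from above and assumes, as the loop 
does, that each y column belongs to the x column immediately preceding its 
group. The name tall is made up here, and I have not timed this against the 
loop version.

```r
x <- data.frame(x1 =  1: 5, y11 =  1: 5,
                x2 =  6:10, y21 =  6:10, y22 = 11:15,
                x3 = 11:15, y31 = 16:20,
                x4 = 16:20, y41 = 21:25, y42 = 26:30, y43 = 31:35)
# positions of the x and y columns
nmx <- grep("^x", names(x))
nmy <- grep("^y", names(x))
# number of y columns that follow each x column
repy <- diff(c(nmx, length(names(x)) + 1)) - 1
# repeat each x column index once per associated y column,
# then stack the selected columns end to end
tall <- data.frame(x = unlist(x[rep(nmx, repy)], use.names = FALSE),
                   y = unlist(x[nmy], use.names = FALSE))
```

Here rep(nmx, repy) does the work of the two nested loops: it selects each 
x column as many times as it has y columns, so unlist() produces the x 
values in the same block order as the stacked y values.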



