[R] how to apply sample function to each row of a data frame?

Mon Nov 22 10:54:02 CET 2010

On Sun, Nov 21, 2010 at 12:43:21PM -0800, wangwallace wrote:
> here is the data frame:
> 
>        a   b  c   A  B   C
> [1,]  1   2  3   4  5   6
> [2,]  7   8  9  10 11 12
> [3,] 13 14 15 16 17 18
> 
> a, b, c are type I variables
> A, B, C are type II variables 
> each row represents the data from one subject
> 
> my purpose is to create a new data frame in which:
> 
> 1) in each row, there is one random number from type I variables, and two
> random numbers from type II variables
> 
> 2) meanwhile, in each row, the two type II numbers have to be only those
> numbers that are not corresponding to the type I number. For example, if the
> type I number is 1, the type II numbers should not include 4.
> 
> 3) type I number and type II numbers in each row should be all from the same
> subject.
> 
> the new data frame should be like this:
> 
>        [,1] [,2] [,3]
> [1,]    I     II    II
> [2,]    I     II    II
> [3,]    I     II    II 

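If the two type II values in a row must always be different, this can be
computed, for example, as follows. Each row gets its own random permutation
of 1:3. The first element selects the type I column and the remaining two
select the type II columns, so the type II column that corresponds to the
chosen type I column is never used.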

  # prepare the input

  A <- matrix(1:18, ncol=6, byrow=TRUE)
  colnames(A) <- c(letters[1:3], LETTERS[1:3])

  # draw a random permutation of 1:3 for each row (row i of ind belongs to row i of A)

  ind <- t(replicate(nrow(A), sample(3)))

  # construct the output without a loop
  # (A is indexed by a two-column matrix of (row, column) pairs)

  rows <- seq_len(nrow(A))
  col1 <- A[cbind(rows, ind[, 1])]
  col2 <- A[cbind(rows, 3 + ind[, 2])]
  col3 <- A[cbind(rows, 3 + ind[, 3])]
  B <- cbind(col1, col2, col3)

  # or with a loop over rows (gives the same result as B)

  C <- matrix(nrow=nrow(A), ncol=3)
  for (i in seq_len(nrow(A))) {
      C[i, 1] <- A[i, ind[i, 1]]           # type I value
      C[i, 2:3] <- A[i, 3 + ind[i, 2:3]]   # two non-corresponding type II values
  }
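
As a rough sketch only, the loop-free variant could be wrapped in a function
that also accepts a data frame, since the original data are in one. The name
pick_per_row and its argument k are made up for illustration. The sketch
assumes that the first k columns are the type I variables and the next k
columns are the corresponding type II variables in the same order.

```r
# Hypothetical helper, not part of the code above.
# Assumption: columns 1..k are type I, columns (k+1)..(2k) are the
# corresponding type II variables in the same order, and k >= 3.
pick_per_row <- function(x, k = ncol(x) %/% 2) {
  x <- as.matrix(x)
  stopifnot(ncol(x) == 2 * k, k >= 3)
  n <- nrow(x)
  rows <- seq_len(n)
  # one random permutation of 1:k per row of x
  ind <- t(replicate(n, sample(k)))
  # position 1 picks the type I column; positions 2 and 3 pick two
  # different type II columns, neither matching the type I choice
  out <- cbind(x[cbind(rows, ind[, 1])],
               x[cbind(rows, k + ind[, 2])],
               x[cbind(rows, k + ind[, 3])])
  colnames(out) <- c("I", "II", "II")
  out
}

# usage with the data from the question, stored as a data frame
df <- as.data.frame(matrix(1:18, ncol = 6, byrow = TRUE))
names(df) <- c(letters[1:3], LETTERS[1:3])
res <- pick_per_row(df)
res
```

The result is a matrix; as.data.frame(res) converts it if a data frame is
needed. Call set.seed() first if the draw has to be reproducible.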

Petr Savicky.