[R] select groups

arun smartpink111 at yahoo.com
Tue Feb 11 18:21:52 CET 2014


Hi,
Try:

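# Example data: 45 rows, four numeric columns
set.seed(42)
dat <- as.data.frame(matrix(sample(20:100, 4*45, replace=TRUE), ncol=4))
# Add a grouping column with three classes
set.seed(345)
dat <- within(dat, class1 <- sample(letters[1:3], 45, replace=TRUE))
# 40% of each class size, before rounding
table(dat$class1)*0.4
#
#  a   b   c 
#6.0 4.8 7.2 
# Split by class, sample round(40%) of the rows in each piece, then recombine
set.seed(85)
res <- do.call(rbind, lapply(split(dat, dat$class1), function(x)
    x[sample(nrow(x), round(0.4*nrow(x)), replace=FALSE), ]))
table(res$class1)

#a b c 
#6 5 7 

# rbind() prefixes the row names with the class label; reset them to 1..n
row.names(res) <- 1:nrow(res)

res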
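
If you need this more than once, the same split/sample idea can be wrapped in a function. This is only a minimal sketch: the name sample_frac_by_group and its arguments are made up for illustration (they are not from any package), and the toy data frame below is invented just to make the example self-contained.

```r
# Hypothetical helper: draw a fixed fraction of rows from each group.
#   data  - a data frame
#   group - name of the grouping column (character string)
#   frac  - fraction of rows to keep within each group
sample_frac_by_group <- function(data, group, frac) {
  # Row indices of 'data', split by the values of the grouping column
  idx_by_group <- split(seq_len(nrow(data)), data[[group]])
  picked <- lapply(idx_by_group, function(idx) {
    n_keep <- round(frac * length(idx))
    # Sample positions within idx rather than idx itself, so that a
    # one-row group is not treated by sample() as the range 1:idx
    idx[sample.int(length(idx), n_keep)]
  })
  out <- data[unlist(picked, use.names = FALSE), , drop = FALSE]
  row.names(out) <- NULL   # plain 1..n row names
  out
}

# Invented toy data: 15 rows of class "a", 5 rows of class "b"
toy <- data.frame(V1 = 1:20, class1 = rep(c("a", "b"), times = c(15, 5)))
set.seed(1)
res2 <- sample_frac_by_group(toy, "class1", 0.4)
table(res2$class1)
```

With 15 and 5 rows per class, round(0.4*15) and round(0.4*5) give 6 and 2 rows. Note that a very small class can round down to zero rows; use ceiling() instead of round() if every class must be represented.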

A.K.


Hi, 

I now have a new question. Suppose that we have the data frame 

V1    V2    V3     V4       class 
23    32     65     33        a 
15    54     76     98        b 
21    23     98     23        a 
23    32     65     33        c 
15    54     76     98        b 
21    23     98     23        c 
23    32     65     33        a 
15    54     76     98        b 
21    23     98     23        c 
... 
and I need to select 40% (for example) of the rows from each class (assume that we have a lot of rows). 

Thanks


