[R] Sampling with Constraints for testing and training data

Eliano eliano.m.marques at gmail.com
Wed Jan 25 13:00:27 CET 2012


Hi People, 

Does anyone have a good solution for this problem: 

I have a data frame called DB, which I split into test and training sets like this:


index <- sample(1:nrow(DB), size=0.2*nrow(DB)) 
test <- DB[index,] 
train <- DB[-index,] 

One of the variables in this data frame is a target variable with two
values, 0 and 1. 

Now suppose I want to constrain the "test" data frame so that it is still
20% of the size of DB, but half of its rows have DB$target=1 and half have
DB$target=0. 

For example, with n=100: 
DB$target = { 0: 80 rows 
              1: 20 rows } 

Then test should have 20 rows: 10 randomly chosen rows with DB$target=1 and
10 randomly chosen rows with DB$target=0. 
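
To make the constraint concrete, here is a minimal sketch of one possible approach: sample row numbers separately within each class and combine them. The toy DB below (with a hypothetical "id" column) and the names n_test, n_per_class, idx0 and idx1 are assumptions made up for illustration; only DB, target, index, test and train come from the code above.

```r
# Toy data matching the example: 100 rows, 80 with target 0 and 20 with target 1.
# The "id" column is hypothetical and only used to check the split.
set.seed(1)
DB <- data.frame(id = 1:100, target = rep(c(0, 1), times = c(80, 20)))

n_test <- round(0.2 * nrow(DB))   # total size of the test set (20 here)
n_per_class <- n_test %/% 2       # rows to draw from each class (10 here)

# Row numbers belonging to each class
idx0 <- which(DB$target == 0)
idx1 <- which(DB$target == 1)

# Sample without replacement inside each class, then combine.
# Note: sample(x, n) treats a length-one numeric x as 1:x, so this
# assumes each class has more than one row.
index <- c(sample(idx0, n_per_class), sample(idx1, n_per_class))

test <- DB[index, ]
train <- DB[-index, ]

table(test$target)   # class counts in the test set
```

With this split the training set keeps the remaining rows, so its class balance differs from that of DB; this sketch also assumes each class has at least n_per_class rows.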



Many Thanks, 
Eliano 



--
View this message in context: http://r.789695.n4.nabble.com/Sampling-with-Constraints-for-testing-and-training-data-tp4325530p4327028.html
Sent from the R help mailing list archive at Nabble.com.


