[R] How to subset my data and at the same time keep the balance?

arun smartpink111 at yahoo.com
Mon Nov 19 18:31:30 CET 2012


Hi,
Maybe this helps:
dat1<-read.table(text="
  V1 V2
1 5 10
2 6  3
3 8  4
4 9 20
5 15 30
6 25 40
7 2  4
8 3  1
9 1  5
10 8 10
",header=TRUE)
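dat2<-dat1[sample(NROW(dat1),round(NROW(dat1)*0.7)),] # random 70% of the rows
dat2$newcol<-TRUE   # marker: TRUE for rows that were sampled
dat1$newcol1<-TRUE  # not needed for the steps below
dat4<-merge(dat1,dat2,by=c("V1","V2"),all=TRUE) # unsampled rows get NA in newcol
dat5<-dat4[is.na(dat4$newcol),][,1:2]  # remaining 30%
dat5  # no seed is set, so the rows differ from run to run; one possible result: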
#  V1 V2
#2  2  4
#4  5 10
#8  9 20
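
A shorter route to the same 70/30 split is to keep the sampled row numbers and use negative indexing for the rest. This is only a sketch: the data frame `dat`, the seed value, and the `lm()` model are made up for illustration and stand in for the real 1000-row dataset and analysis.

```r
set.seed(42)  # assumed seed, only so the split can be repeated

# Hypothetical stand-in for the 1000-row dataset from the question
dat <- data.frame(V1 = rnorm(1000))
dat$V2 <- 2 * dat$V1 + rnorm(1000)

n_train   <- round(0.7 * nrow(dat))       # 700 rows
train_idx <- sample(nrow(dat), n_train)   # row numbers, drawn without replacement

train <- dat[train_idx, ]    # 70% used for fitting
test  <- dat[-train_idx, ]   # remaining 30%, picked by negative indexing

# Illustrative model only; the real analysis would go here
fit  <- lm(V2 ~ V1, data = train)
pred <- predict(fit, newdata = test)
rmse <- sqrt(mean((test$V2 - pred)^2))   # error on the held-out 30%
```

Because rows are selected by position, two rows with identical values are still kept apart, which a merge on V1 and V2 cannot do.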
A.K.



----- Original Message -----
From: Eddie Smith <eddieatr at gmail.com>
To: r-help at r-project.org
Cc: 
Sent: Monday, November 19, 2012 12:16 PM
Subject: [R] How to subset my data and at the same time keep the balance?

Hi guys,

I have a dataset with 1000 rows. For my analysis, I need to take 70% of the
data, run my analysis on it, and then use the remaining 30% to test my model.

Could anybody kindly help me with this?

Cheers





