[BioC] splitting a data frame

James W. MacDonald jmacdon at med.umich.edu
Thu Apr 26 21:24:30 CEST 2007


Hi David,

kfbargad at ehu.es wrote:
> Dear list,
> 
> sorry for this very basic question but I have searched the R manual 
> without success. What function do I need to randomly split a 
> data.frame into two data.frames (for example a data.frame containing 
> 200 rows and 5 columns into two of say 150 and 50 rows each)? Should I 
> first create a vector of 150 random numbers between 1 and 200 (but 
> how?) and then use subset() function?

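It depends on exactly what you want to end up with. To me, splitting 
implies cutting at a random row N, so that rows 1 to N-1 go into one 
piece and rows N to 200 go into the other.

N <- sample(2:200, 1)   # start at 2: with N = 1, 1:(N-1) is c(1, 0), not empty
sub1 <- df[1:(N-1),]
sub2 <- df[N:200,]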

Or maybe you want to randomly subset rows?

idx <- sample(1:200, 150)        # 150 distinct row numbers, no replacement
sub1 <- df[idx,]
sub2 <- df[!(1:200 %in% idx),]   # the 50 rows that were not drawn
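
The random-subset idea can be sketched a bit more generally, so that the 
row count is taken from the data rather than hard-coded as 200. This is 
only an illustration: random_split() and dat are hypothetical names, not 
anything in base R or Bioconductor, and the example data are simulated. 
Note that both pieces keep the original row order here, whereas df[idx,] 
returns the rows in sampled order.

```r
# Hypothetical helper (not part of base R): split a data frame at random
# into two pieces, the first with n1 rows and the second with the rest.
random_split <- function(dat, n1) {
  stopifnot(is.data.frame(dat), n1 >= 0, n1 <= nrow(dat))
  idx  <- sample(seq_len(nrow(dat)), n1)   # n1 distinct row numbers
  keep <- seq_len(nrow(dat)) %in% idx      # logical mask, safe when n1 is 0
  list(first  = dat[keep,  , drop = FALSE],
       second = dat[!keep, , drop = FALSE])
}

set.seed(1)  # only to make the example reproducible
dat <- data.frame(matrix(rnorm(200 * 5), nrow = 200, ncol = 5))  # simulated
parts <- random_split(dat, 150)
dim(parts$first)    # 150 5
dim(parts$second)   # 50 5
```

A logical mask is used instead of dat[-idx,] because negative indexing 
with an empty idx selects no rows at all rather than all of them.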

Anyway, time taken to read 'An Introduction to R' would be well spent.

Best,

Jim


> 
> Thanks in advance for your help
> David
> 


-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

