[R] Splitting Large Data Frame into Two

arun smartpink111 at yahoo.com
Wed Jun 13 19:25:41 CEST 2012


Hi,

Try this:

dat2<-read.table(text="
ID        chrom loc.start loc.end
MA1B1Cy5  chr1  197008581 197026781
MA1B3Cy5  chr1  197079541 197080381
MA5B2Cy5  chr1  197088651 197118071
MA5B2Cy5  chr1  197172341 197189641
MA5B3Cy5  chr1  197008581 197010601
MA5B4Cy5  chr1  197025421 197025701
MA5B4Cy5  chr1  197145601 197159111
",sep="",header=TRUE)

dat3<-subset(dat2,ID=="MA1B1Cy5"|ID=="MA5B4Cy5")

dat3
        ID chrom loc.start   loc.end
1 MA1B1Cy5  chr1 197008581 197026781
6 MA5B4Cy5  chr1 197025421 197025701
7 MA5B4Cy5  chr1 197145601 197159111
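
The subset() call above returns only one of the two requested data frames. A minimal sketch of getting both halves at once, following Sarah's %in% suggestion, might look like the block below. The names tumor_ids, is_tumor, tumor and non_tumor are made up for illustration, and treating MA1B1Cy5 and MA5B4Cy5 as the "tumor" group is only an assumption taken from the example IDs in the question.

```r
# Same sample data as in the thread
dat2 <- read.table(text = "
ID chrom loc.start loc.end
MA1B1Cy5 chr1 197008581 197026781
MA1B3Cy5 chr1 197079541 197080381
MA5B2Cy5 chr1 197088651 197118071
MA5B2Cy5 chr1 197172341 197189641
MA5B3Cy5 chr1 197008581 197010601
MA5B4Cy5 chr1 197025421 197025701
MA5B4Cy5 chr1 197145601 197159111
", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical vector of IDs classified by hand (assumed here to be the tumor samples)
tumor_ids <- c("MA1B1Cy5", "MA5B4Cy5")

# One logical vector decides membership for every row
is_tumor <- dat2$ID %in% tumor_ids

# Rows whose ID is in the vector, and the complement
tumor     <- dat2[is_tumor, ]
non_tumor <- dat2[!is_tumor, ]
```

Because the grouping lives in a single vector, adding more IDs means extending tumor_ids rather than chaining more | conditions. If a list of both pieces is preferred, split(dat2, is_tumor) should return the same two data frames under the names "FALSE" and "TRUE".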


A.K.

----- Original Message -----
From: Joshua Budman <josh.budman at gmail.com>
To: Sarah Goslee <sarah.goslee at gmail.com>
Cc: R-help at r-project.org
Sent: Wednesday, June 13, 2012 12:15 PM
Subject: Re: [R] Splitting Large Data Frame into Two

This is a sample of the data:
            ID chrom loc.start   loc.end
1    MA1B1Cy5  chr1 197008581 197026781
18   MA1B3Cy5  chr1 197079541 197080381
55   MA5B2Cy5  chr1 197088651 197118071
70   MA5B2Cy5  chr1 197172341 197189641
72   MA5B3Cy5  chr1 197008581 197010601
89   MA5B4Cy5  chr1 197025421 197025701
104  MA5B4Cy5  chr1 197145601 197159111

And I would like to put the rows which have, for instance, the "ID"  
MA1B1Cy5 and MA5B4Cy5 in one data frame and then the rest of the rows  
in another data frame. My decision is based on whether these samples  
are "tumor" or "non-tumor" samples which I determine manually based on  
another document. I hope this helps and thank you in advance!

On 13-Jun-12, at 12:02 PM, Sarah Goslee wrote:

> How are you deciding which values in the first column go into which  
> subset?
>
> If you have a vector containing those values, you could use %in% or if
> they're determined logically you could use that criterion.
>
> A reproducible example and a bit more information would get you more
> concrete answers.
>
> Sarah
>
> On Wed, Jun 13, 2012 at 11:06 AM, Joshua Budman  
> <josh.budman at gmail.com> wrote:
>> Hi
>> I have a large data frame of the form:
>>  a 1
>>  b 2
>>  c 3
>> And I would like to split this data frame into two separate data
>> frames based on the values in the first column, e.g.
>> a 1
>> b 2
>>        and
>> c 3
>>
>> Is there any way of doing this without having to write a different
>> "which" statement for each value in column 1 and then doing an  
>> "rbind"
>> at the end? I tried using an if/else statement using a lot of
>> "||" but that did not work well either. Help would be much  
>> appreciated.
>>
>> Thanks,
>> Josh
>
> -- 
> Sarah Goslee
> http://www.functionaldiversity.org


