[R] subsetting a data frame

R. Michael Weylandt michael.weylandt at gmail.com
Fri Mar 2 18:56:34 CET 2012


Please always cc the list for archival/threading reasons. 

Short answer: unique() returns the unique elements themselves, not something you can subset by, such as a vector of logical indices or row numbers.
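
To make that concrete, here is a rough sketch on a cut-down copy of your data. The small df below is made up from a few of your rows, and I'm assuming exon is stored as character (hence stringsAsFactors = FALSE); with a factor column the bad index would be read as integer codes instead, which is wrong in a different way.

```r
# Toy data frame built from a few of the posted rows (illustrative only)
df <- data.frame(
  exon = c("ChrX_133594175_133594368_HPRT1",
           "ChrX_133607389_133607495_HPRT1",
           "ChrX_133607389_133607495_HPRT1",
           "ChrX_133609211_133609394_HPRT1",
           "ChrX_133609211_133609394_HPRT1"),
  size = c(193, 106, 106, 183, 183),
  stringsAsFactors = FALSE
)

# unique() returns the distinct values themselves, not row positions
u <- unique(df$exon)

# Used as a row index, those strings are looked up among the row names
# ("1", "2", ...); none match, so the result is rows of NA
bad <- df[u, ]

# duplicated() returns one logical per row: TRUE for a repeat of an earlier value
dup <- duplicated(df$exon)

# Keep only the first row seen for each exon
good <- df[!dup, ]
```

Here good has one row per exon, which is what I think you were after.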

Note that, in general, unique(x) gives the same result as x[!duplicated(x)]. I'd imagine there are cases where this breaks down, but I can't assemble one off the top of my head.
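
A quick way to check that equivalence yourself, as a sketch on made-up inputs (x and d are hypothetical, not your data):

```r
# Made-up vector with repeats (illustrative only)
x <- c("a", "b", "a", "c", "b")

# Both forms keep the first occurrence of each value, in original order
same_vec <- identical(unique(x), x[!duplicated(x)])

# For a data frame, unique() drops repeated rows; the indexing form
# needs the trailing comma to select rows
d <- data.frame(k = c(1, 1, 2), v = c("p", "p", "q"))
same_df <- identical(unique(d), d[!duplicated(d), ])
```

Both same_vec and same_df should come out TRUE for these inputs; that says nothing about the corner cases I couldn't think of.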

Michael

On Mar 2, 2012, at 12:13 PM, nathalie <nac at sanger.ac.uk> wrote:

> Thanks.
> Why doesn't unique() work here?
>> I believe you want the duplicated() function.
>> 
>> Michael
>> 
>> On Mar 2, 2012, at 10:19 AM, nathalie<nac at sanger.ac.uk>  wrote:
>> 
>>> Hi,
>>> Here is my problem: I want to subset this data frame, df, so that each df$exon value is printed only once, even if it appears several times.
>>> 
>>> unique(df$exon) shows me the unique exons.
>>> If I try to print only the unique exon rows
>>> with df[unique(df$exon),], it doesn't print only the unique ones :(
>>> 
>>> Could you help?
>>> Thanks,
>>> Nat
>>> 
>>> 
>>> 
>>> 
>>>                                  exon size  chr     start       end
>>> 413077 ChrX_133594175_133594368_HPRT1  193 ChrX 133594175 133594368
>>> 413270 ChrX_133594183_133594368_HPRT1  185 ChrX 133594183 133594368
>>> 413455 ChrX_133594381_133594565_HPRT1  184 ChrX 133594381 133594565
>>> 413639 ChrX_133607389_133607495_HPRT1  106 ChrX 133607389 133607495
>>> 413745 ChrX_133607389_133607495_HPRT1  106 ChrX 133607389 133607495
>>> 413851 ChrX_133607404_133607495_HPRT1   91 ChrX 133607404 133607495
>>> 413942 ChrX_133609211_133609394_HPRT1  183 ChrX 133609211 133609394
>>> 414125 ChrX_133609211_133609394_HPRT1  183 ChrX 133609211 133609394
>>> 414308 ChrX_133620495_133620560_HPRT1   65 ChrX 133620495 133620560
>>> 414373 ChrX_133620495_133620560_HPRT1   65 ChrX 133620495 133620560
>>> 414438 ChrX_133620692_133620696_HPRT1    4 ChrX 133620692 133620696
>>> 414442 ChrX_133624218_133624235_HPRT1   17 ChrX 133624218 133624235
>>> 
>>> 
>>> 
>>> -- 
>>> The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
>>> 
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.


