[R] concise syntax for selecting multiple rows

David Winsemius dwinsemius at comcast.net
Mon Apr 26 17:46:19 CEST 2010


On Apr 26, 2010, at 11:27 AM, Gabor Grothendieck wrote:

> Here are four ways:
>
> # state.name comes with R
> DF <- data.frame(state = state.name, num = seq_along(state.name))
>
> DF[DF$state %in% c("Iowa", "Utah"),]
>
> subset(DF, state %in% c("Iowa", "Utah"))
>
> subset(DF, grepl("Iowa|Utah", state))

John;

The grepl implementation offers benefits that the %in% versions do  
not, e.g. partial matches:

  subset(DF, grepl("New+", state))
           state num
29 New Hampshire  29
30    New Jersey  30
31    New Mexico  31
32      New York  32

Or equivalently and probably faster:

 > DF[grepl("New+", DF$state), ]
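
A minimal sketch of the difference, rebuilding the DF from the quoted message. The Virginia example and the anchored pattern are illustrative additions, not part of the earlier posts:

```r
# state.name ships with base R; DF is rebuilt as in the quoted message
DF <- data.frame(state = state.name, num = seq_along(state.name))

# %in% compares whole strings, so the fragment "New" selects no rows
exact <- DF[DF$state %in% "New", ]

# grepl() finds the pattern anywhere in the string
partial_subset <- subset(DF, grepl("New", state))
partial_index  <- DF[grepl("New", DF$state), ]

# Partial matching can select more than intended:
# "Virginia" also matches "West Virginia"
loose <- DF[grepl("Virginia", DF$state), ]

# Anchoring with ^ and $ restricts the match to the whole string
strict <- DF[grepl("^Virginia$", DF$state), ]
```

So grepl is the tool when fragments are wanted, and %in% (or an anchored pattern) when only whole values should match.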

This gives the state names that contain a space:
 > DF[grepl(". +", DF$state), ]
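
A small check of the space pattern, again only a sketch. The fixed = TRUE form is an alternative not mentioned in the earlier posts; it treats " " as a literal string rather than as a regular expression:

```r
# DF is rebuilt as in the quoted message
DF <- data.frame(state = state.name, num = seq_along(state.name))

# Regular expression: any character followed by one or more spaces
by_regex <- grepl(". +", DF$state)

# Literal search for a single space, no regular expression involved
by_fixed <- grepl(" ", DF$state, fixed = TRUE)

# For state.name the two agree, since no name starts with a space
all(by_regex == by_fixed)

# Rows for the state names that contain a space
two_word <- DF[by_fixed, ]
```

The two forms would differ only for a value that begins with a space, which ". +" skips because it requires a character before the space.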
-- 
David.
>
> library(sqldf) # see http://sqldf.googlecode.com
> sqldf("select * from DF where state in ('Iowa', 'Utah')")
>
>
> On Mon, Apr 26, 2010 at 11:12 AM, John Sorkin
> <jsorkin at grecc.umaryland.edu> wrote:
>> I would like to select rows if a row contains any one of several  
>> values. I can do the selection as follows:
>>
>> result[,"Subject"]=="JEFF" | result[,"Subject"]=="BG"
>>
>> But this is very unwieldy if one wishes to select many, many rows  
>> as one has to continuously repeat the source:
>>
>> result[,"Subject"]=="JEFF" | result[,"Subject"]=="BG" |  
>> result[,"Subject"]=="John"  |  result[,"Subject"]=="Mary"
>>
>> Is there an easier way? I tried the following but it did not work:
>>
>>
>> result[,"Subject"]==c("JEFF" | "BG" | "John"  | "Mary")
>>
>> Thanks,
>> John
>>
>>
>>
>> John David Sorkin M.D., Ph.D.
>> Chief, Biostatistics and Informatics
>> University of Maryland School of Medicine Division of Gerontology
>> Baltimore VA Medical Center
>> 10 North Greene Street
>> GRECC (BT/18/GR)
>> Baltimore, MD 21201-1524
>> (Phone) 410-605-7119
>> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>

David Winsemius, MD
West Hartford, CT


