[R] subset based on column names and then subset based on the inverse (grep?, or...)

Tue Aug 3 13:53:09 CEST 2010

On Aug 3, 2010, at 7:41 AM, stephen sefick wrote:

> I would like to be able to grab x and y columns out of a dataframe and
> then grab all of the columns that are not equal to x or y.  I am sure
> that I am missing something easy.
>
>
> ftbr_UTM_downstream <- (structure(list(site =
> c("Jennie_Creek_Main_Stem", "Wolf_Pit_Creek_Main_Stem",
> "Little_Rockfish_Main_Stem_North", "Big_Muddy_Creek_Main_Stem",
> "Flat_Creek_Main_Stem", "little_river_tributary",  
> "Hector_Creek_Main_Stem",
> "Juniper_Creek_Main_Stem", "Field_Branch_Main_Stem",  
> "Gum_Branch_Main_Stem"
> ), base = c("ftbr", "ftbr", "ftbr", "ftbr", "ftbr", "ftbr", "ftbr",
> "ftbr", "ftbr", "ftbr"), creek = c("jcms", "wpms", "lrf1", "bmcm",
> "fcms", "lrtb", "hcms", "jpms", "fbms", "gbms"), date =  
> c("06/20/2010",
> "06/20/2010", "06/18/2010", "06/18/2010", "06/21/2010", "06/22/2010",
> "06/22/2010", "06/21/2010", "06/19/2010", "06/19/2010"), elevation_m  
> = c(101,
> 81, 59, 75, 73, 55, 55, 88, 77, 87), x = c(652159, 651646, 674147,
> 635466, 665726, 675295, 673098, 658917, 655613, 651748), y =  
> c(3887647,
> 3886986, 3893724, 3876272, 3893886, 3895529, 3895076, 3882474,
> 3881587, 3884249), station = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
>    notes_ = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names =  
> c("site",
> "base", "creek", "date", "elevation_m", "x", "y", "station",
> "notes_"), row.names = c("1", "3", "5", "7", "9", "11", "13",
> "15", "17", "19"), class = "data.frame"))
>
> # this doesn't work, but I would like it to.  I also tried grep, to no avail
>
> colnames(ftbr_UTM_downstream)=="x" | "y"

That parses, but it fails when evaluated, because the expression on the
rhs of "|", namely "y", is not logical but character.
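
To make that concrete: each operand of "|" has to be a logical vector, so
the comparison must be written out on both sides. A minimal sketch on a
small made-up data frame (the name df and its values are hypothetical
stand-ins for ftbr_UTM_downstream, not part of the original post):

```r
# Hypothetical toy data frame standing in for ftbr_UTM_downstream
df <- data.frame(site = c("a", "b"), x = c(1, 2), y = c(3, 4),
                 station = c(1, 1))

# Both sides of "|" are now logical vectors, one element per column name
is_xy <- colnames(df) == "x" | colnames(df) == "y"

xy   <- df[ , is_xy]     # only the x and y columns
rest <- df[ , !is_xy]    # every other column
```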
>
Try instead:

ftbr_UTM_downstream[ , grep("^x$|^y$", colnames(ftbr_UTM_downstream))]
ftbr_UTM_downstream[ , -grep("^x$|^y$", colnames(ftbr_UTM_downstream))]
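
One caveat with the negative-index form: if the pattern matches no column,
grep() returns integer(0), and indexing with -integer(0) drops every column
rather than keeping them all. A sketch of an inverse that avoids this by
using grepl(), which returns a logical vector (the toy df is hypothetical):

```r
# Hypothetical toy data frame
df <- data.frame(site = c("a", "b"), x = c(1, 2), y = c(3, 4))

# grepl() gives TRUE/FALSE per column name, so negating it is always safe
keep <- !grepl("^(x|y)$", colnames(df))
rest <- df[ , keep, drop = FALSE]

# A pattern that matches nothing keeps all columns with grepl() ...
none_grepl <- df[ , !grepl("^z$", colnames(df)), drop = FALSE]
# ... but leaves zero columns with the -grep() form
none_grep  <- df[ , -grep("^z$", colnames(df))]
```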

Or use the select argument of subset():

subset(ftbr_UTM_downstream,
       select = colnames(ftbr_UTM_downstream) %in% c("x", "y"))
?"%in%"
subset(ftbr_UTM_downstream,
       select = colnames(ftbr_UTM_downstream)[
                  !colnames(ftbr_UTM_downstream) %in% c("x", "y")])
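
For reference, select also accepts unquoted column names, and a leading
minus sign drops them, which may be shorter to type. A sketch on a
hypothetical toy data frame, with a setdiff() equivalent for plain "["
indexing:

```r
# Hypothetical toy data frame
df <- data.frame(site = c("a", "b"), x = c(1, 2), y = c(3, 4),
                 station = c(1, 1))

xy   <- subset(df, select = c(x, y))     # keep only x and y
rest <- subset(df, select = -c(x, y))    # drop x and y, keep the rest

# Same inverse without subset(): names in df that are not "x" or "y"
rest2 <- df[ , setdiff(colnames(df), c("x", "y")), drop = FALSE]
```

The unquoted form is convenient interactively; the setdiff() form is the
easier one to use when the column names are held in a character vector.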

>
> -- 
> Stephen Sefick
> ____________________________________
> | Auburn University                                   |
> | Department of Biological Sciences           |
> | 331 Funchess Hall                                  |
> | Auburn, Alabama                                   |
> | 36849                                                    |
> |___________________________________|
> | sas0025 at auburn.edu                             |
> | http://www.auburn.edu/~sas0025             |
> |___________________________________|
>
> Let's not spend our time and resources thinking about things that are
> so little or so large that all they really do for us is puff us up and
> make us feel like gods.  We are mammals, and have not exhausted the
> annoying little problems of being mammals.
>
>                                 -K. Mullis
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT