[R] Selecting columns whose names contain "mutated" except when they also contain "non" or "un"

Greg Snow 538280 at gmail.com
Mon Apr 23 23:05:39 CEST 2012


Here is a method that uses negative lookbehind:

> tmp <- c('mutation','nonmutated','unmutated','verymutated','other')
> grep("(?<!un)(?<!non)muta", tmp, perl=TRUE)
[1] 1 4

It looks for muta that is not immediately preceded by un or non (but
it would still match "unusually mutated", since the un is not
immediately before the muta).
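
Applied to the column selection in the original question, the same
pattern could be used roughly as follows. This is only a sketch: the
data frame df and its values are made up for illustration and stand in
for KRASyn, whose actual columns I have not seen.

```r
# Hypothetical data frame standing in for KRASyn (names and values are assumptions)
df <- data.frame(mutated    = c(1, 0),
                 mutation   = c(0, 1),
                 nonmutated = c(1, 1),
                 unmutated  = c(0, 0),
                 other      = c(5, 6))

# Indices of columns containing "muta" not immediately preceded by "un" or "non"
keep <- grep("(?<!un)(?<!non)muta", names(df), perl = TRUE)

# Column names that were selected
selected <- names(df)[keep]
print(selected)

# Row-wise sum over the selected columns only
df$Mutant_comb <- rowSums(df[keep])
print(df$Mutant_comb)
```

Note that perl=TRUE is required, since the default regular expression
engine in grep does not support lookbehind.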

Hope this helps,

On Mon, Apr 23, 2012 at 10:10 AM, Paul Miller <pjmiller_57 at yahoo.com> wrote:
> Hello All,
>
> Started out awhile ago trying to select columns in a dataframe whose names contain some variation of the word "mutant" using code like:
>
> names(KRASyn)[grep("muta", names(KRASyn))]
>
> The idea then would be to add together the various columns using code like:
>
> KRASyn$Mutant_comb <- rowSums(KRASyn[grep("muta", names(KRASyn))])
>
> What I discovered though, is that this selects columns like "nonmutated" and "unmutated" as well as columns like "mutated", "mutation", and "mutational".
>
> So I'd like to know how to select columns that have some variation of the word "mutant" without the "non" or the "un". I've been looking around for an example of how to do that but haven't found anything yet.
>
> Can anyone show me how to select the columns I need?
>
> Thanks,
>
> Paul
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538280 at gmail.com


