[R] Selecting columns whose names contain "mutated" except when they also contain "non" or "un"

Bert Gunter gunter.berton at gene.com
Mon Apr 23 19:15:50 CEST 2012


But maybe ... (see below)
-- Bert

On Mon, Apr 23, 2012 at 9:25 AM, Paul Miller <pjmiller_57 at yahoo.com> wrote:
> Hello Dr. Winsemius,
>
> Unfortunately, I also have terms like "krasmutated". So simply selecting words that start with "muta" won't work in this case.
>
> Thanks,
>
> Paul
>
>
> --- On Mon, 4/23/12, David Winsemius <dwinsemius at comcast.net> wrote:
>
>> From: David Winsemius <dwinsemius at comcast.net>
>> Subject: Re: [R] Selecting columns whose names contain "mutated" except when they also contain "non" or "un"
>> To: "Paul Miller" <pjmiller_57 at yahoo.com>
>> Cc: r-help at r-project.org
>> Received: Monday, April 23, 2012, 11:16 AM
>>
>> On Apr 23, 2012, at 12:10 PM, Paul Miller wrote:
>>
>> > Hello All,
>> >
>> > Started out awhile ago trying to select columns in a dataframe whose
>> > names contain some variation of the word "mutant" using code like:
>> >
>> > names(KRASyn)[grep("muta", names(KRASyn))]
>> >
>> > The idea then would be to add together the various columns using code like:
>> >
>> > KRASyn$Mutant_comb <- rowSums(KRASyn[grep("muta", names(KRASyn))])
>> >
>> > What I discovered though, is that this selects columns like
>> > "nonmutated" and "unmutated" as well as columns like "mutated",
>> > "mutation", and "mutational".
>> >
>> > So I'd like to know how to select columns that have some variation of
>> > the word "mutant" without the "non" or the "un". I've been looking
>> > around for an example of how to do that but haven't found anything yet.

If this **is** a complete specification then wouldn't simply:

x <- names(yourdataframe)
grepl("muta", x) & !grepl("nonmuta|unmuta", x)

do it?

e.g.
> x <- c("nonmutated","unmutated","mutation","mutated","krasmutated")
> grepl("muta",x) & !grepl("nonmuta|unmuta",x)
[1] FALSE FALSE  TRUE  TRUE  TRUE
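
To connect the logical index back to the original rowSums() goal, here is a minimal sketch. The data frame below is a made-up stand-in for KRASyn (its column names and values are assumptions for illustration only). It also shows a possible single-pattern alternative using a negative lookbehind, which requires perl = TRUE.

```r
# Hypothetical stand-in for KRASyn; names and values are invented.
KRASyn <- data.frame(
  nonmutated  = c(1, 0, 0),
  unmutated   = c(0, 1, 0),
  mutation    = c(1, 1, 0),
  mutated     = c(0, 1, 1),
  krasmutated = c(1, 1, 0)
)

x <- names(KRASyn)

# Keep names containing "muta" unless they contain "nonmuta" or "unmuta".
keep <- grepl("muta", x) & !grepl("nonmuta|unmuta", x)

# Alternative: one PCRE pattern with a negative lookbehind.
keep2 <- grepl("(?<!non|un)muta", x, perl = TRUE)
stopifnot(identical(keep, keep2))

# Sum the selected columns row by row; drop = FALSE keeps a data frame
# even if only one column matches.
KRASyn$Mutant_comb <- rowSums(KRASyn[, keep, drop = FALSE])
```

As with the two-grepl version, this only covers the "non" and "un" prefixes; any other negating prefix in the real column names would need to be added to the pattern.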

>> >
>> > Can anyone show me how to select the columns I need?
>>
>> If you want only columns whose names _begin_ with "muta"
>> then add the "^" character at the beginning of your
>> pattern:
>>
>> names(KRASyn)[grep("^muta", names(KRASyn))]
>>
>> (This should be explained on the ?regex page.)
>>
>> --
>> David Winsemius, MD
>> West Hartford, CT



-- 

Bert Gunter
Genentech Nonclinical Biostatistics



