[R] grep won't work finding one column

John McKown john.archie.mckown at gmail.com
Tue Oct 14 17:06:42 CEST 2014


AT and at are not the same. If you want a case-insensitive comparison
for the characters "at", you need to add "ignore.case=TRUE". E.g.:

df[,grep("at",colnames(df),ignore.case=TRUE)]

That should match the column name you gave, AT_1. But that name does not
fit your initial description of columns "ending with .at"; it merely
contains AT. So I am still a bit confused about your needs.
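
A minimal sketch of the case-insensitive match, using a small made-up
data frame that borrows the two column names from the sample data below
(the name `df` and the three rows are only illustrative). It also shows
one thing that may be behind the single-column trouble, though that is an
assumption since no failing example was posted: when only one column
matches, `df[, idx]` returns a plain vector rather than a data frame
unless `drop = FALSE` is given. Note that the pattern here is "at" with no
leading dot, because an unescaped "." needs some character in front of
"at" and AT_1 has none.

```r
# Hypothetical two-column data frame mirroring the sample data
df <- data.frame(Sample_1 = c("A/A", "G/G", "T/T"),
                 AT_1     = c("RR", "AA", "AA"),
                 stringsAsFactors = FALSE)

# Case-insensitive search for "at" anywhere in the column names
idx <- grep("at", colnames(df), ignore.case = TRUE)

# With a single match, [ , idx] drops the result to a character vector
one_col_vec <- df[, idx]
is.data.frame(one_col_vec)

# drop = FALSE keeps a one-column data frame, so later code that
# expects a data frame sees the same type for 1 to n matches
one_col_df <- df[, idx, drop = FALSE]
is.data.frame(one_col_df)
colnames(one_col_df)
```

If the later steps of the script call things like colnames() or ncol() on
the result, the `drop = FALSE` form is the one that behaves the same way
regardless of how many columns match.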

On Tue, Oct 14, 2014 at 9:55 AM, Kate Ignatius <kate.ignatius at gmail.com> wrote:
> For example,
>
> DF will usually have numerous columns with sample1.at sample1.dp
> sample1.fg sample2.at sample2.dp sample2.fg and so on....
>
> I'm running this code in R as part of a shell script which runs over
> several files of different sizes, so sometimes it will come across a
> file with only one sample in it, i.e. sample1. When the R code runs
> through this file, trying to grep out the "sample1.at" column does not
> work and it halts.
>
> Here is some sample data... say I want to get out only the AT_ column....
>
>
> Sample_1 AT_1
> A/A RR
> G/G AA
> T/T AA
> G/A RA
> G/G RR
> C/C AA
> C/C AA
> C/T RA
> A/A AA
> T/G RA
>
> it will have a problem grepping out this single column.
>
> On Tue, Oct 14, 2014 at 10:38 AM, John McKown
> <john.archie.mckown at gmail.com> wrote:
>> On Tue, Oct 14, 2014 at 9:23 AM, Kate Ignatius <kate.ignatius at gmail.com> wrote:
>>> I'm having an issue with grep:
>>>
>>> I have numerous columns that end with .at... when I use grep like so:
>>>
>>> df[,grep(".at",colnames(df))]
>>>
>>> it works fine.  When I have one column that ends with .at, it does not
>>> work.  Why is that?  As this is a loop with a varying number of columns
>>> ending in .at I would like some code that would work with 1 to n
>>> columns.
>>>
>>> Is there something more optimal than grep?
>>>
>>> Thanks!
>>
>> I can't answer your direct question. But do you realize that your code
>> does not match your words? The grep shown does not match _only_ columns
>> whose names end with the characters '.at'. It matches all column names
>> which contain any character followed by the characters "at". To match
>> only columns whose names end with the characters ".at", you
>> need: grep("\\.at$",colnames(df)).
>>
>> You might want to post an example which fails. Just to be complete, be
>> sure to use the dput() function so that it is easy for members of the
>> group to cut'n'paste to get your data into our own R workspace.
>>
>> --
>> There is nothing more pleasant than traveling and meeting new people!
>> Genghis Khan
>>
>> Maranatha! <><
>> John McKown

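The difference between the loose and the anchored pattern quoted above
can be sketched as follows. The vector `cols` is made up for
illustration: it follows the sample1.at / sample1.dp naming from the
thread and adds one hypothetical name, "format", that the loose pattern
picks up by accident. endsWith() is a base R function (R 3.3.0 or later)
shown only as a non-regex alternative; it is not something proposed in
the thread.

```r
# Hypothetical column names following the sample1.at / sample1.dp scheme
cols <- c("sample1.at", "sample1.dp", "sample1.fg", "format", "sample2.at")

# An unescaped "." matches any character, so "format" (m + "at") matches too
loose <- grep(".at", cols, value = TRUE)

# "\\." is a literal dot and "$" anchors the match at the end of the name
strict <- grep("\\.at$", cols, value = TRUE)

# Fixed-string alternative that avoids regular expressions altogether
strict2 <- cols[endsWith(cols, ".at")]

# dput() prints an object as R code that others can paste into a session
dput(cols)
```

The backslash has to be doubled inside an R string literal; a single
"\." is rejected by the parser as an unrecognized escape.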




