[R] problem with pattern matching

xavier.chardon at free.fr xavier.chardon at free.fr
Wed Aug 5 18:40:41 CEST 2009


Hi,

grep does not handle a vector of patterns: it uses only the first element and ignores the rest.

> grep( c("foo1", "foo2"), c("fffoo5", "fffoo6", "fffoo2", "fffoo1"))
[1] 4

This call is equivalent to:
grep( "foo1", c("fffoo5", "fffoo6", "fffoo2", "fffoo1") )
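
To actually try every pattern, one option is to call grep once per pattern and combine the results. This is only a sketch: the names "patterns", "x", "hits" and "idx" are made up for the illustration and are not from the original question.

```r
patterns <- c("foo1", "foo2")
x <- c("fffoo5", "fffoo6", "fffoo2", "fffoo1")

# One grep() call per pattern; each list element holds the indices
# of the elements of x that match that pattern.
hits <- lapply(patterns, grep, x = x)
names(hits) <- patterns

# Union of all matching positions, in ascending order.
idx <- sort(unique(unlist(hits)))
x[idx]
```

Keeping the per-pattern list ("hits") around also shows which pattern matched which element, which a single combined result would not.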


Maybe you could use the plyr package. I am only speculating, but something like this might work:

ddply( list, .(ID), function(x) dataframe[ grep(x$ID[[1]], dataframe$ID) , ] )

ddply splits "list" by ID into smaller data frames. Assuming each ID is unique in list, each piece is a one-row data frame ("x" in the line above). The function takes the ID from that row, greps for it in dataframe$ID, and returns the matching rows of dataframe. I am assuming each ID matches exactly one row; with zero or several matches it might fail, I am not sure.

Maybe someone can come up with a more efficient way of doing it. The whole trick is to use grep with a vector of patterns.
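
One possible base R alternative, as a sketch only: paste the IDs into a single alternation pattern and let grepl pick the rows. The data frames "wanted" and "df" are hypothetical stand-ins for "list" and "dataframe", and the sketch assumes the IDs contain no regular expression metacharacters.

```r
# Hypothetical data standing in for the poster's "list" and "dataframe".
wanted <- data.frame(ID = c("foo1", "foo2"), stringsAsFactors = FALSE)
df <- data.frame(ID = c("fffoo5", "fffoo6", "fffoo2", "fffoo1"),
                 value = 1:4, stringsAsFactors = FALSE)

# Join the IDs into one alternation pattern: "foo1|foo2".
# Assumption: the IDs contain no regex metacharacters.
pattern <- paste(wanted$ID, collapse = "|")

# grepl() returns one TRUE/FALSE per row, so it can index rows directly.
matched <- df[grepl(pattern, df$ID), ]
matched
```

This returns every row of df whose ID contains any of the wanted IDs as a substring, so it also covers the case where an ID matches zero or several rows.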

Xavier

----- Original Message -----
From: "Don MacQueen" <macq at llnl.gov>
To: "Rnewbie" <xuancj at yahoo.com>, r-help at r-project.org
Sent: Wednesday, August 5, 2009 16:49:58 GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: [R] problem with pattern matching

Perhaps
   intersect()
or
   merge()
will help. But, like others, I find it difficult to understand 
exactly what you want. I'd suggest providing a short example with 
actual ID values.
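
If the IDs are meant to match exactly rather than as substrings, the two functions mentioned above could be used roughly as follows. This is a sketch with invented ID values; "wanted" and "df" are hypothetical names, not the poster's data.

```r
# Hypothetical data: IDs of interest and a table to extract rows from.
wanted <- data.frame(ID = c("a2", "a4", "a9"), stringsAsFactors = FALSE)
df <- data.frame(ID = c("a1", "a2", "a3", "a4"),
                 score = c(10, 20, 30, 40), stringsAsFactors = FALSE)

# IDs present in both vectors (exact matches only).
common <- intersect(wanted$ID, df$ID)

# Rows of df whose ID is one of the wanted IDs.
rows_in <- df[df$ID %in% wanted$ID, ]

# Inner join on ID: keeps only IDs found in both data frames.
joined <- merge(wanted, df, by = "ID")
joined
```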

-Don

At 2:36 AM -0700 8/5/09, Rnewbie wrote:
>I wanted to extract the rows I am interested in from a data frame. I used:
>
>grep(list$ID, dataframe$ID, value=T) # list contains the IDs I am interested in
>
>I got one match in return, which is the very first ID in list. It seems the
>matching process just stopped once the first match was found.
>
>
>
>David Winsemius wrote:
>>
>>
>>  On Aug 4, 2009, at 11:16 AM, Rnewbie wrote:
>>
>>>
>>>  dear all,
>>>
>>>  I got a problem with pattern matching using grep. I extracted a list of
>>>  characters from a data frame, and I tried to match this list of characters
>>>  to a column from another data frame. In return, I got only one match, but
>>>  there should be far more matches. Any ideas what has gone wrong?
>>
>>  In general this falls into the category of  a request to "read my 
>>  mind". One, out of probably an infinite number, of ways to get such a 
>>  result is to use if()  when you needed ifelse().
>>
>>>
>>>  Another question: if I also want to match the whole of the elements
>>>  against the non-initial parts of the elements in another table, which
>>>  command should I use?
>>
>>  Cannot even assign a semantic meaning to that one. What are the
>>  "non-initial parts of the elements of another table"?
>>
>>
>>  ******************************************************************
>>>   .... provide commented, minimal, self-contained, reproducible code.
>>  ******************************************************************
>>>
>>>  Thanks
>>
>>  David Winsemius, MD
>>  Heritage Laboratories
>>  West Hartford, CT
>>
>>  ______________________________________________
>>  R-help at r-project.org mailing list
>>  https://stat.ethz.ch/mailman/listinfo/r-help
>>  PLEASE do read the posting guide
>>  http://www.R-project.org/posting-guide.html
>>  and provide commented, minimal, self-contained, reproducible code.
>>
>>
>


-- 
--------------------------------------
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062




