[R] Match beginning and end of string (grepl)
john.archie.mckown at gmail.com
Tue Sep 2 14:34:02 CEST 2014
On Tue, Sep 2, 2014 at 7:12 AM, Johannes Radinger
<johannesradinger at gmail.com> wrote:
> I'd like to match the beginning and the end of a string. E.g. I want to
> extract all strings from a vector that begin with "12" and end with
> "Apples":
> a <- "2 green Apples"
> b <- "12 green Apples"
> c <- "12 Apples and 2 green Bananas"
> d <- "12 yellow Bananas"
> fruitlist <- c(a,b,c,d)
> # This is how to extract all that begin with 12
> But how can I get only those that also end with "Apples"? So basically
> just item b "12 green Apples" should remain.
> Is there any clear description and examples of regular expressions
> and how to use them? I find the manual ?grepl very difficult to read.
Please try to change your email to not use HTML, per forum requirements.
Now, on to some real help. Regular expressions are the most
complicated thing that I've ever run across except, maybe, for APL.
For me, the definitive book on them is "Mastering Regular Expressions"
by Jeffrey E. F. Friedl.
It costs $7.49 for the Kindle version and $32.19 for the "dead tree" (paperback)
version. If you go to that Amazon page, it has some other
possibilities as well.
However a very nice, free, web tutorial is available at:
To answer your immediate question, given your very good start, try the
regular expression "^12.*Apples$".
The dollar sign says "match the logical end of the string" and is the
key you are looking for. In English, the regex says: match the front
of the string (^), then immediately match the characters "12". The .
means "match any single character", and the * that follows it means
"match the previous expression (any character) 0 or more times". Then
match the string "Apples", and finally match the end of the string ($).
You don't want me to explain how the regex engine actually does this.
Talk about confusing to the novice, and likely even to people who can
use regular expressions fairly well (such as myself).
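
As a minimal sketch of the pattern described above, assuming the four
example strings from the question and using grepl() to build a logical
index (the names pattern, matches and selected are only illustrative):

```r
# Example data taken from the original question
fruitlist <- c("2 green Apples",
               "12 green Apples",
               "12 Apples and 2 green Bananas",
               "12 yellow Bananas")

# ^12      anchor at the start of the string, then the literal "12"
# .*       any character, zero or more times
# Apples$  the literal "Apples", anchored at the end of the string
pattern <- "^12.*Apples$"

matches  <- grepl(pattern, fruitlist)  # logical vector, one entry per string
selected <- fruitlist[matches]         # keep only the matching strings

print(selected)
# [1] "12 green Apples"
```

Note that .* also allows further digits right after the "12", so a
string such as "120 Apples" would match too. If that matters, requiring
a space after the number, as in "^12 .*Apples$", is one possible way to
tighten the pattern.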
There is nothing more pleasant than traveling and meeting new people!