[R] Word boundaries and gregexpr in R 2.2.1

Gabor Grothendieck ggrothendieck at gmail.com
Wed Feb 1 02:26:22 CET 2006


When I tried your gregexpr call on Windows XP the disk started grinding,
probably because memory was being swapped, and the call seemed to run forever
until I finally had to kill R. I am using "R version 2.2.1, 2005-12-20".
What did seem to work was this:

 gregexpr("X", gsub("\\b\\w|\\w\\b", "X", text))

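where "X" stands for any character that does not occur in the text. Note
that this returns the positions of the first and last character of each word
rather than zero-length matches at the boundaries themselves.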
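
A minimal sketch of that workaround, applied to the two strings from the
question, might look like the following. The placeholder "#", the variable
names, and the check that the placeholder is absent from the text are
illustrative assumptions. perl = TRUE is also an assumption added here so
that \b and \w follow PCRE semantics; the call above did not use it.

```r
# The two example strings from the original question.
text <- c("This is a first example sentence.",
          "And this is a second example      sentence.")

# Placeholder character (an assumption): it must not occur in the text.
placeholder <- "#"
stopifnot(!any(grepl(placeholder, text, fixed = TRUE)))

# Overwrite the first and last character of every word with the placeholder.
# perl = TRUE is an assumption; the original call relied on the default engine.
marked <- gsub("\\b\\w|\\w\\b", placeholder, text, perl = TRUE)

# Locate the placeholders; fixed = TRUE treats the placeholder literally.
edges <- gregexpr(placeholder, marked, fixed = TRUE)

# Character positions of the word edges in the first string.
edge_pos <- as.integer(edges[[1]])
print(marked[1])
print(edge_pos)
```

A one-letter word such as "a" yields a single position, because its first
and last character coincide. If a boundary is wanted after the last
character of a word, adding 1 to the corresponding position would be one way
to get it.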


On 1/31/06, Stefan Th. Gries <stgries_lists at arcor.de> wrote:
> Hi
>
> I have a question about how to match word boundaries. I bet it has a very simple answer, but I have found it neither by trial and error nor by searching the help archives for the terms in the subject line. The problem is this: I have a vector of two character strings.
>
> text<-c("This is a first example sentence.", "And this is a second example      sentence.")
>
> If I now look for word boundaries with regexpr, this is what I get:
> > regexpr("\\b", text, perl=TRUE)
> [1] 1 1
> attr(,"match.length")
> [1] 0 0
>
> So far, so good. But with gregexpr I get:
>
> > gregexpr("\\b", text, perl=TRUE)
> Error: cannot allocate vector of size 524288 Kb
> In addition: Warning messages:
> 1: Reached total allocation of 1015Mb: see help(memory.size)
> 2: Reached total allocation of 1015Mb: see help(memory.size)
>
> Why don't I get the locations and extensions of all word boundaries?
>
> I am using R 2.2.1 on a machine running Windows XP:
> > R.version
>         _
> platform i386-pc-mingw32
> arch     i386
> os       mingw32
> system   i386, mingw32
> status
> major    2
> minor    2.1
> year     2005
> month    12
> day      20
> svn rev  36812
> language R
>
> Thanks a lot,
> STG
> --
> Stefan Th. Gries
> ----------------------------------------
> University of California, Santa Barbara
> http://people.freenet.de/Stefan_Th_Gries



