[R] grup function

Bert Gunter gunter.berton at gene.com
Wed Apr 10 22:01:03 CEST 2013


Below.

On Wed, Apr 10, 2013 at 3:58 AM, Dominic Roye <dominic.roye at gmail.com> wrote:
> Hello,
>
> How can I match blanks within words when I have several phrases?
>
> c("Shangh i", "Hello here i am","h llo")
>
>
>> gsub(" ","a",c("Shangh i", "Hello here i am","h llo"))
> [1] "Shanghai"        "Helloahereaiaam" "hallo"
>
>
> I would like to have [1] "Shanghai"  "Hello here i am" "hallo"
>
> I hope someone can help me.

Doubt if anyone can.

To a parser, "Shangh i" is just as much two valid "words" as "I am" or "am I".

So either you have to give the parser (the regex engine) more rules
(patterns) to distinguish "words" from non-words or have a
dictionary handy.
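
As a rough illustration of the dictionary route, here is a minimal sketch.
The word list `dictionary`, the function name `fill_blanks`, and the choice
of "a" as the fill letter are all assumptions made up for this example; a
real application would need a proper word list and would have to handle
blanks inside longer phrases, which this sketch does not attempt.

```r
# Hypothetical mini-dictionary; a real one would be a full word list.
dictionary <- c("shanghai", "hallo", "hello", "here", "i", "am")

# Replace every blank with `fill`, but keep the result only when the
# whole filled-in string is a known word; otherwise return the input.
fill_blanks <- function(x, fill = "a", dict = dictionary) {
  candidate <- gsub(" ", fill, x, fixed = TRUE)
  ifelse(tolower(candidate) %in% dict, candidate, x)
}

fill_blanks(c("Shangh i", "Hello here i am", "h llo"))
# [1] "Shanghai"        "Hello here i am" "hallo"
```

The lookup is case-insensitive only because the candidate is lower-cased
before it is compared with the (lower-case) word list.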

From your example, a rule that replaces a blank immediately before or
after a single-letter word might not produce too high a false positive
rate, but it will obviously go wrong on the genuine one-letter words
"I" and "a". But maybe your real situation doesn't conform to this.
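
The single-letter rule can be sketched with Perl-style lookarounds. The
function name `fill_single_letter_gaps` and the fill letter "a" are
assumptions for illustration only. Note what happens to the second phrase:
the blanks around the real word "i" are replaced too, which is exactly the
failure on "I" and "a" mentioned above.

```r
# Replace a blank that directly follows or directly precedes a
# single-letter "word" (one word character between word boundaries).
fill_single_letter_gaps <- function(x, fill = "a") {
  # (?<=\b\w)   blank preceded by a one-letter word
  #  (?=\w\b)   blank followed by a one-letter word
  gsub("(?<=\\b\\w) | (?=\\w\\b)", fill, x, perl = TRUE)
}

fill_single_letter_gaps(c("Shangh i", "Hello here i am", "h llo"))
# [1] "Shanghai"        "Hello hereaiaam" "hallo"
```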

(I would be fascinated to be shown how I am wrong about this if I am!)

-- Bert



>
>
> Thanks,



-- 

Bert Gunter
Genentech Nonclinical Biostatistics



