[R] toupper does not work in sub + regex

Tan, Richard RTan at panagora.com
Mon Apr 13 19:22:35 CEST 2009


Thanks, Bill!  One more question, how do I get SviRaw, i.e., just
uppercase the 1st char and keep everything else the same?  

sub("q_([a-z])([a-zA-Z]*)", "\\U\\1 \\2", "q_sviRaw",perl=TRUE)

Did not work. 

Thank you!
Richard
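
One possible way to get "SviRaw", sketched under the assumption that the
perl=TRUE replacement syntax also accepts \\E to end case conversion (it is
described in ?regex next to \\U and \\L). The attempt above most likely
returns "S VIRAW", because \\U keeps applying to \\2 and the replacement
contains a literal space. The variable names below are illustrative only.

```r
# Input string from the question.
x <- "q_sviRaw"

# The attempt from the message: \\U is never switched off, so the second
# group is uppercased too, and the literal space is kept in the result.
attempt <- sub("q_([a-z])([a-zA-Z]*)", "\\U\\1 \\2", x, perl = TRUE)

# \\U uppercases group 1, \\E ends the conversion, group 2 is copied as is.
fixed <- sub("q_([a-z])([a-zA-Z]*)", "\\U\\1\\E\\2", x, perl = TRUE)

print(attempt)
print(fixed)
```

Without perl=TRUE the \\U and \\E escapes are not interpreted this way, so
the flag is needed for this approach.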

-----Original Message-----
From: William Dunlap [mailto:wdunlap at tibco.com] 
Sent: Monday, April 13, 2009 1:17 PM
To: Tan, Richard; r-help at r-project.org
Subject: Re: [R] toupper does not work in sub + regex

You could also use \\U and \\L in the replacement with perl=TRUE.  \\U
"converts the rest of the replacement to upper case" and \\L converts to
lowercase. (By "replacement" it means the parts of the replacement that
arise from parenthesized subpatterns in the pattern argument, not the
replacement argument itself.)  E.g.,

> sub("q_([a-z])[a-zA-Z]*", "\\U\\1\\L", "q_sviRaw", perl=TRUE)
[1] "S"
> sub("q_([a-z])([a-zA-Z]*)", "\\U\\1 then \\L\\2", "q_sviRaw", perl=TRUE)
[1] "S then viraw"
> sub("q_([a-z])([a-zA-Z]*)", "\\U\\1 then \\2", "q_sviRaw", perl=TRUE)
[1] "S then VIRAW"

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com 

----------------------------------------------------------------------
[R] toupper does not work in sub + regex

Gabor Grothendieck ggrothendieck at gmail.com
Mon Apr 13 18:26:12 CEST 2009

sub only handles replacement strings, not replacement functions.
Your code is the same as:

sub("q_([a-z])[a-zA-Z]*", '\\1', "q_sviRaw")

since '\\1' contains no alphabetic characters, so toupper('\\1') is just
literally '\\1', and that is what sub uses.
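
A small check of the point above, as a sketch: R evaluates toupper('\\1')
before sub ever sees it, so the replacement argument is the unchanged
backreference. The name `replacement` is only for illustration.

```r
# toupper() runs first, on the two-character string backslash + "1".
replacement <- toupper("\\1")

# Neither character is a letter, so the string comes back unchanged.
unchanged <- identical(replacement, "\\1")

# sub() therefore inserts the captured lowercase letter as is.
result <- sub("q_([a-z])[a-zA-Z]*", replacement, "q_sviRaw")

print(unchanged)
print(result)
```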

The gsubfn function in the gsubfn package can deal with replacement
functions:

> library(gsubfn)
> gsubfn("q_([a-z])[a-zA-Z]*", toupper, "q_sviRaw")
[1] "S"

See the home page: http://gsubfn.googlecode.com, vignette and help page.
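
For readers without gsubfn installed, the replacement-function idea can be
roughly sketched in base R with regexpr and regmatches<-. The helper `sub_fn`
below is hypothetical: it is not part of any package and is not how gsubfn is
implemented. It passes the whole first match (not the parenthesized groups)
to the function.

```r
# Hypothetical helper: replace the first match of `pattern` in `x`
# with the result of calling `fn` on the matched text.
sub_fn <- function(pattern, fn, x) {
  m <- regexpr(pattern, x, perl = TRUE)      # locate the first match
  regmatches(x, m) <- fn(regmatches(x, m))   # swap in the transformed text
  x
}

# Match "q_" plus one lowercase letter, drop the "q_" prefix and
# uppercase the remaining letter; the rest of the string is untouched.
out <- sub_fn("q_[a-z]", function(s) toupper(substring(s, 3)), "q_sviRaw")
print(out)
```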

On Mon, Apr 13, 2009 at 11:54 AM, Tan, Richard <RTan at panagora.com> wrote:
> Hi, I don't know what I am doing wrong, but toupper does not seem to
> work in sub + regex.  The following returns 's', not the uppercase
> 'S' that I expect:
>
> sub("q_([a-z])[a-zA-Z]*",toupper('\\1'),"q_sviRaw")
>
> Can someone tell me what I did wrong?
>
> Thanks,
> Richard



