[R] help with regexpr in gsub

Hong Ooi Hong.Ooi at iag.com.au
Thu Jan 18 01:44:33 CET 2007


substr is vectorised, so it should work fine without needing an explicit
loop.
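
As a minimal sketch, assuming a small made-up vector called go_ids in the
format described in the question (the name and the values are illustrative
only, not the poster's data), a single substr call covers the whole vector.
A regular-expression alternative with sub is shown alongside for comparison.
It relies on escaping the dot so that it is matched literally and then
consuming the rest of the string.

```r
# Hypothetical sample data in the "GO:0008104.ISS" format from the question.
go_ids <- c("GO:0008104.ISS", "GO:0016020.IEA", "GO:0005515.IPI")

# substr() is vectorised: one call keeps characters 1 through 10 of
# every element, so no explicit loop is needed.
by_position <- substr(go_ids, 1, 10)

# Regex alternative: "\\." is a literal dot, ".*" matches whatever follows,
# and "$" anchors the match at the end of the string. sub() replaces the
# first (and here only) match in each element with the empty string.
by_pattern <- sub("\\..*$", "", go_ids)

print(by_position)
print(identical(by_position, by_pattern))
```

The substr version depends on the prefix always being 10 characters long,
as stated in the question. The sub version only depends on the dot, so it
should also cope if the prefix length ever changes. Note that in a pattern
such as \.*? the quantifier applies to the dot itself (zero or more literal
dots), which is not the same as "a dot followed by anything".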

Hong Ooi
Senior Research Analyst, IAG Limited
388 George St, Sydney NSW 2000
+61 (2) 9292 1566
-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Kimpel, Mark
Sent: Thursday, 18 January 2007 11:27 AM
To: r-help at stat.math.ethz.ch
Subject: [R] help with regexpr in gsub

I have a very long vector of character strings of the format
"GO:0008104.ISS" and need to strip off the dot and anything that follows
it. There are always 10 characters before the dot. The characters after
the dot, and how many of them there are, vary.

So, I would like to return the strings in the format "GO:0008104". I could
do this with substr and a loop over the entire vector, but I thought there
might be a more elegant (and faster) way to do this.

I have tried gsub using regular expressions without success. The code 

gsub(pattern= "\.*?" , replacement="", x=character.vector)

correctly locates the positions in the vector that contain the dot, but
replaces all of the strings with "". Obviously not what I want. Is there
a regular expression for replacement that would accomplish what I want?

Or, does R have a better way to do this?



Mark W. Kimpel MD
(317) 490-5129 Work, & Mobile
(317) 663-0513 Home (no voice mail please)
1-(317)-536-2730 FAX
