[R] regular expressions : extracting numbers

Kuhn, Max Max.Kuhn at pfizer.com
Mon Jul 30 15:07:12 CEST 2007


This might work. It deletes every character that is not a digit, so a string with no digits comes back as "":

> numOnly <- function(x) gsub("[^0-9]", "", x)
> numOnly("lema, rb 2%")
[1] "2"
> numOnly("rb")
[1] ""
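
Since the original post says NA would suit better than "", here is a minimal sketch along the same lines. The names numOrNA and firstNum are hypothetical helpers made up for illustration; only gsub(), regexpr(), regmatches() and as.numeric() are base R. One caveat worth knowing: deleting all non-digits glues separate numbers together, so the second helper assumes you only want the first run of digits in each string.

```r
# Hypothetical helper: strip non-digits as above, then turn "" into NA.
numOrNA <- function(x) {
  digits <- gsub("[^0-9]", "", x)   # keep only the characters 0-9
  digits[digits == ""] <- NA        # no digits found: mark as missing
  as.numeric(digits)
}

# Hypothetical alternative: keep only the first run of digits per string.
firstNum <- function(x) {
  m <- regexpr("[0-9]+", x)              # position of first match, -1 if none
  out <- rep(NA_character_, length(x))   # default to NA for every element
  out[m != -1] <- regmatches(x, m)       # fill in the strings that matched
  as.numeric(out)
}

x <- c("lema, rb 2%", "rb", "rb 12", "rj 30%", "rj, rb")
numOrNA(x)                   # 2 NA 12 30 NA
firstNum(x)                  # 2 NA 12 30 NA

# The two differ when a string holds more than one number:
numOrNA("rb 2%, rj 30%")     # 230 (digits are concatenated)
firstNum("rb 2%, rj 30%")    # 2
```

Drop the as.numeric() calls if a character vector with NA is preferred over a numeric one.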

Max

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of GOUACHE David
Sent: Monday, July 30, 2007 7:59 AM
To: r-help at stat.math.ethz.ch
Subject: [R] regular expressions : extracting numbers

Hello all,

I have a vector of character strings, in which I have letters, numbers, and symbols. What I wish to do is obtain a vector of the same length with just the numbers.
A quick example -

Extract of the original vector:
"lema, rb 2%" "rb 2%" "rb 3%" "rb 4%" "rb 3%" "rb 2%,mineuse" "rb" "rb" "rb 12" "rb" "rj 30%" "rb" "rb" "rb 25%" "rb" "rb" "rb" "rj, rb"

and the type of thing I wish to end up with:
"2" "2" "3" "4" "3" "2" "" "" "12" "" "30" "" "" "25" "" "" "" ""

or, instead of "", NA would be acceptable (it would actually be almost better for me).

Anyway, I've been battling with gsub() and the like, but I'm drowning in regular expressions, despite a few hours of looking at Perl tutorials...
So if anyone can help me out, it would be greatly appreciated!!

In advance, thanks very much.

David Gouache
Arvalis - Institut du Végétal
Station de La Minière
78280 Guyancourt
Tel: 01.30.12.96.22 / Port: 06.86.08.94.32
