[R] find numbers in a line with letters

Bert Gunter gunter.berton at gene.com
Thu Aug 27 01:19:25 CEST 2009


Re-using Gabor's suggestion from yesterday, I think the regex incantation

gsub("[^[:digit:].]+"," ",x)

also will do, where x is your vector of strings. It replaces each run of
characters other than digits and "." with a single space.

> x
[1] "this Item costs 3.32 Dollars or maybe 10.00 cents"
> gsub("[^[:digit:].]+"," ",x)
[1] " 3.32 10.00 " 

You can then "pipe" this through a textConnection to convert it to numeric:

> scan(textConnection(gsub("[^[:digit:].]+"," ",x)))
Read 2 items
[1]  3.32 10.00
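
As a sketch only: more recent versions of R let scan() take a text
argument, so the same idea works without leaving a connection open. The
helper name extract_numbers is hypothetical, and the approach still
assumes that every "." in the input belongs to a number.

```r
# Hypothetical helper: blank out everything that is not a digit or ".",
# then let scan() parse what is left as numeric values.
extract_numbers <- function(x) {
  cleaned <- gsub("[^[:digit:].]+", " ", x)
  scan(text = cleaned, quiet = TRUE)
}

x <- "this Item costs 3.32 Dollars or maybe 10.00 cents"
extract_numbers(x)
```

One caveat: a sentence-ending period, as in "costs 3.32 Dollars.", is kept
by the character class and leaves a lone "." that scan() cannot read as a
number, so input like that needs a stricter pattern.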

Bert Gunter
Genentech Nonclinical Biostatistics

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Steve Lianoglou
Sent: Wednesday, August 26, 2009 3:49 PM
To: Martin Batholdy
Cc: r-help at r-project.org
Subject: Re: [R] find numbers in a line with letters

Hi,

On Aug 26, 2009, at 6:38 PM, Martin Batholdy wrote:

> hi,
>
> is there an easy way to extract numbers from a string?
>
> for example I have;
> "this Item costs 3.32 Dollars"
>
> is there an easy way to extract the 3.32 as a number?

Regular expressions to the rescue?

Perhaps you'll need to fine tune it, but see here:

R> gregexpr("(\\d+(\\.\\d+)?)", "this Item costs 3.32 Dollars", perl=T)
[[1]]
[1] 17
attr(,"match.length")
[1] 4

R> gregexpr("(\\d+(\\.\\d+)?)", "this Item costs 3.32 Dollars, that item costs 10.12 dollars", perl=T)
[[1]]
[1] 17 47
attr(,"match.length")
[1] 4 5

R> gregexpr("(\\d+(\\.\\d+)?)", "this Item costs 3.32 Dollars, that item costs 10 dollars even, ", perl=T)
[[1]]
[1] 17 47
attr(,"match.length")
[1] 4 2

R> gregexpr("(\\d+(\\.\\d+)?)", "this one is free ", perl=T)
[[1]]
[1] -1
attr(,"match.length")
[1] -1
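
The calls above return only positions and lengths. One possible way to get
the numbers themselves, assuming a version of R that provides regmatches()
in base, is sketched here; the variable names are illustrative only.

```r
# Sketch: extract the matched substrings and convert them to numeric.
# The pattern is the one used in the gregexpr() examples above.
pattern <- "(\\d+(\\.\\d+)?)"
x <- c("this Item costs 3.32 Dollars, that item costs 10.12 dollars",
       "this one is free ")

# regmatches() returns one character vector per input string;
# a string with no match yields character(0).
matches <- regmatches(x, gregexpr(pattern, x, perl = TRUE))
numbers <- lapply(matches, as.numeric)
numbers
```

The result is a list with one numeric vector per input string, so a string
without any number gives an empty vector rather than an error.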

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
   |  Memorial Sloan-Kettering Cancer Center
   |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact




