[R] parsing numeric values

Gabor Grothendieck ggrothendieck at gmail.com
Wed Nov 18 14:04:42 CET 2009


A minor variant might be the following:

   library(gsubfn)
   strapply(input, "\\d+\\.\\d+E[-+]?\\d+", as.numeric, simplify = rbind)

where:

- as.numeric is used in place of c, so the combine argument is not needed
- \\d+ matches one or more digits
- \\. matches a decimal point
- [-+]? matches -, + or nothing (i.e. an optional sign).
- parentheses around the regular expression are not needed
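
For comparison, the same regular expression can also be applied with base R
only. This is a minimal sketch, not part of the gsubfn approach above: it
assumes the sample input from the original question (retyped here as a
character vector), and the names pattern, matches and values are made up for
illustration. perl = TRUE is passed so that \\d is handled by the PCRE engine.

```r
# Sample log lines from the original question, retyped as a character vector.
input <- c(
  "some text",
  " <ax> =    1.3770E-03     <bx> =    3.4644E-07",
  " <ay> =    1.9412E-04     <by> =    4.8840E-08",
  "",
  "other text",
  " <aax>  =    1.3770E-03     <bbx> =    3.4644E-07",
  " <aay>  =    1.9412E-04     <bby> =    4.8840E-08"
)

# Digits, a decimal point, digits, then E with an optional sign and digits.
pattern <- "\\d+\\.\\d+E[-+]?\\d+"

# One character vector of matches per input line (empty when nothing matches).
matches <- regmatches(input, gregexpr(pattern, input, perl = TRUE))

# Drop lines without a match, convert to numeric, and stack one row per line.
values <- do.call(rbind, lapply(matches[lengths(matches) > 0], as.numeric))
```

Here values is a matrix with one row per matching line, and values[, 1] holds
the first number on each of those lines, i.e. the <ax>, <ay>, <aax> and <aay>
entries that the original question asked for.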

On Wed, Nov 18, 2009 at 7:28 AM, Henrique Dallazuanna <wwwhsd at gmail.com> wrote:
> Try this:
>
> strapply(input, "([0-9]+\\.[0-9]+E-[0-9]+)", c, simplify = rbind,
> combine = as.numeric)
>
> On Wed, Nov 18, 2009 at 9:57 AM, baptiste auguie
> <baptiste.auguie at googlemail.com> wrote:
>> Dear list,
>>
>> I'm seeking advice to extract some numeric values from a log file
>> created by an external program. Consider the following example,
>>
>> input <-
>> readLines(textConnection(
>> "some text
>>  <ax> =    1.3770E-03     <bx> =    3.4644E-07
>>  <ay> =    1.9412E-04     <by> =    4.8840E-08
>>
>> other text
>>  <aax>  =    1.3770E-03     <bbx> =    3.4644E-07
>>  <aay>  =    1.9412E-04     <bby> =    4.8840E-08"))
>>
>> ## this is what I want
>> results <- c(as.numeric(strsplit(grep("<ax>", input, value = TRUE), " ")[[1]][8]),
>>             as.numeric(strsplit(grep("<ay>", input, value = TRUE), " ")[[1]][8]),
>>             as.numeric(strsplit(grep("<aax>", input, value = TRUE), " ")[[1]][9]),
>>             as.numeric(strsplit(grep("<aay>", input, value = TRUE), " ")[[1]][9])
>>             )
>>
>> ## [1] 0.00137700 0.00019412 0.00137700 0.00019412
>>
>> The use of strsplit is not ideal here as there is a different number
>> of space characters in the lines containing <ax> and <aax> for
>> instance (hence the indices 8 and 9 respectively).
>>
>> I tried to use gsubfn for a cleaner construct,
>>
>> strapply(input, "<ax> += +([0-9.]+)", c, simplify=rbind,combine=as.numeric)
>>
>> but I can't seem to find the correct regular expression to deal with
>> the exponent.
>>
>>
>> Any tips are welcome!
>>
>>
>> Best regards,
>>
>> baptiste
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O