[R] Differentiate numbers from reference for rows

Gabor Grothendieck ggrothendieck at gmail.com
Sat Oct 30 14:42:01 CEST 2010


On Fri, Oct 29, 2010 at 6:54 PM, M.Ribeiro <mresendeufv at yahoo.com.br> wrote:
>
> So, I am having a tricky reference file to extract information from.
>
> The format of the file is
>
> x   1 + 4 * 3 + 5 + 6 + 11 * 0.5
>
> So, the elements that are not being multiplied (1, 5 and 6) and the elements
> before the multiplication sign (4 and 11) are actually references to
> the rows of a matrix from which I need to extract the elements.
>
> The numbers after the multiplication sign are regular numbers
> Ex:
>
>> x<-matrix(20:35)
>> x
>      [,1]
>  [1,]   20
>  [2,]   21
>  [3,]   22
>  [4,]   23
>  [5,]   24
>  [6,]   25
>  [7,]   26
>  [8,]   27
>  [9,]   28
> [10,]   29
> [11,]   30
> [12,]   31
> [13,]   32
> [14,]   33
> [15,]   34
> [16,]   35
>
> I would like to read rows 1, 4, 5, 6 and 11 and sum them. However, the
> elements in rows 4 and 11 are multiplied by 3 and 0.5, respectively.
>
> So it would be
> 20 + 23 * 3 + 24 + 25 + 30 * 0.5.
>
> And I have this format in different files, so I can't do it all by hand.
> Can anybody help me with a script that can differentiate these?


I assume that every number, except for the second number in each
"number * number" pattern, is to be replaced by the value in that row
of x.  Try this.  We define a regular expression which matches the
first number, ([0-9]+), of each potential pair, optionally (?) followed
by spaces ( *), a star (\\*), more spaces ( *) and a number ([0-9.]+).
gsubfn passes the first and second backreferences (the matches to the
parenthesized portions of the regular expression) to f and inserts the
output of f where each match had been.  When there is no multiplier,
the second backreference is an empty string.  Here s is the character
vector holding the formulas.

library(gsubfn)
f <- function(a, b) paste(x[as.numeric(a)], b)
s2 <- gsubfn("([0-9]+)( *\\* *[0-9.]+)?", f, s)
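
To see which pieces the regular expression picks out, the matches can be
inspected in base R without gsubfn. This is only an illustrative sketch:
the names pattern, tokens, a and b are introduced here, and the split
into a and b is done by hand to mimic the two backreferences that f
would receive.

```r
s <- "1 + 4 * 3 + 5 + 6 + 11 * 0.5"
pattern <- "([0-9]+)( *\\* *[0-9.]+)?"

# Every substring matched by the full pattern, in order of appearance
tokens <- regmatches(s, gregexpr(pattern, s))[[1]]
print(tokens)
# "1"  "4 * 3"  "5"  "6"  "11 * 0.5"

# Mimic the two backreferences: a is the leading row number,
# b is whatever follows it (empty when there is no multiplier)
a <- sub("^([0-9]+).*$", "\\1", tokens)
b <- substring(tokens, nchar(a) + 1)
print(a)  # "1" "4" "5" "6" "11"
print(b)  # ""  " * 3"  ""  ""  " * 0.5"
```

Only the leading number of each match is looked up in x; the multiplier
is carried through unchanged as part of b.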

If the objective is then to perform the calculation that each string
represents, try this:
sapply(s2, function(x) eval(parse(text = x)))

For example,

> s <- c("1 + 4 * 3 + 5 + 6 + 11 * 0.5", "1 + 4 * 3 + 5 + 6 + 11 * 0.5")
> x <- matrix(20:35)
> f <- function(a, b) paste(x[as.numeric(a)], b)
> s2 <- gsubfn("([0-9]+)( *\\* *[0-9.]+)?", f, s)
> s2
[1] "20  + 23  * 3 + 24  + 25  + 30  * 0.5" "20  + 23  * 3 + 24  + 25  + 30  * 0.5"
> sapply(s2, function(x) eval(parse(text = x)))
20  + 23  * 3 + 24  + 25  + 30  * 0.5 20  + 23  * 3 + 24  + 25  + 30  * 0.5
                                  153                                   153
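
If evaluating generated text with eval(parse()) is not desirable, the
same sum can be computed numerically in base R. The following is a
minimal sketch, not part of the approach above. It assumes each formula
contains only "+" and "*", and that every term is either "row" or
"row * multiplier". The function name eval_formula is hypothetical.

```r
x <- matrix(20:35)
s <- c("1 + 4 * 3 + 5 + 6 + 11 * 0.5", "1 + 4 * 3 + 5 + 6 + 11 * 0.5")

eval_formula <- function(formula, x) {
  # Split the formula into terms at each "+"
  terms <- strsplit(formula, "+", fixed = TRUE)[[1]]
  # Split each term at "*": first piece is the row, second the multiplier
  parts <- strsplit(terms, "*", fixed = TRUE)
  rows <- vapply(parts, function(p) as.numeric(trimws(p[1])), numeric(1))
  # A term without "*" gets an implicit multiplier of 1
  mult <- vapply(parts, function(p) {
    if (length(p) > 1) as.numeric(trimws(p[2])) else 1
  }, numeric(1))
  # Look up each referenced row of x, scale it, and add everything up
  sum(x[rows] * mult)
}

result <- vapply(s, eval_formula, numeric(1), x = x, USE.NAMES = FALSE)
print(result)  # 153 153
```

This avoids building and parsing R code from file contents, at the cost
of supporting only the simple term structure described above.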

For more see the gsubfn home page at http://gsubfn.googlecode.com


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


