[R] A calculation in data.frame
Duncan Murdoch
murdoch.duncan at gmail.com
Tue Jan 7 22:58:27 CET 2014
On 14-01-07 3:21 PM, Ron Michael wrote:
> Hi,
>
> I have to perform a formula-driven calculation in a data.frame (as defined below). Let's say I have the following DF:
>
>> DF <- data.frame(A1 = c('a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'), A2 = c('m', 'n', 'p', 'm', 'n', 'p', 'm', 'n', 'p'), A3 = c(1,2,3,4,5,6,7,8,9))
>> DF
>   A1 A2 A3
> 1  a  m  1
> 2  a  n  2
> 3  a  p  3
> 4  b  m  4
> 5  b  n  5
> 6  b  p  6
> 7  c  m  7
> 8  c  n  8
> 9  c  p  9
>
>
> Now let's say the user gives a formula to be applied to the elements of the A1 column. Say the formula looks like:
>
> z = a + 2*b + c (in fact the formula will be arbitrary, like z = f(a, b, c))
>
> Once such formula is given, the result will be like (for the columns A1, A2, A3 respectively)
>
> z m 16
> z n 20
> z p 24
>
> the last column comes from the fact that 1 + 2*4 + 7 = 16, 2 + 2*5 + 8 = 20, 3 + 2*6 + 9 = 24
>
> Given that the formula will be user-defined and is to be applied to a data.frame like DF, I am seeking an automated way to accomplish the task for a really big DF of this kind and a fairly complex formula.
>
> Can somebody suggest an efficient way to perform this task in R?
A dataframe isn't really the best structure for this problem. What you
really have in R terms are three environments, indexed by A2, each
containing bindings to a, b and c. Within each of those environments
you want to create a new binding to z, according to the user-supplied
formula.
The way I'd implement that pretty much matches my description. Have a
named list of environments, write a function to evaluate the formula and
assign the value, then just lapply it to your list.
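A minimal sketch of that approach, under a few assumptions: the environments are built by hand from the values in the question, the formula arrives as an unevaluated R expression, and the names envs, user_formula and add_z are made up for illustration.

```r
# Hypothetical data: one environment per A2 level, each binding a, b and c.
envs <- list(
  m = list2env(list(a = 1, b = 4, c = 7)),
  n = list2env(list(a = 2, b = 5, c = 8)),
  p = list2env(list(a = 3, b = 6, c = 9))
)

# The user-supplied formula, held as an unevaluated expression.
user_formula <- quote(a + 2 * b + c)

# Evaluate the formula in one environment and bind the result to z there.
add_z <- function(env, expr) {
  assign("z", eval(expr, envir = env), envir = env)
  invisible(env)
}

# Environments are modified in place, so the return value can be ignored.
invisible(lapply(envs, add_z, expr = user_formula))

# Collect the new bindings into a named vector, one element per A2 level.
z <- sapply(envs, function(env) get("z", envir = env))
print(z)
```

With the data from the question this gives 16, 20 and 24 for m, n and p. If the formula arrives as text rather than as an expression, parse(text = ...)[[1]] would turn it into one, but keep in mind that eval() will run whatever code the user supplies.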
If you really do want things in the dataframe format, then write
functions to convert from it to the list of environments at the
beginning, and back to it at the very end. Don't work with that format
if efficiency matters to you.
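The two conversion functions might look something like the sketch below. It assumes the column layout of DF from the question (A1 holds the variable names, A2 the grouping, A3 the values); df_to_envs and envs_to_df are hypothetical names, not existing functions.

```r
DF <- data.frame(
  A1 = c("a", "a", "a", "b", "b", "b", "c", "c", "c"),
  A2 = c("m", "n", "p", "m", "n", "p", "m", "n", "p"),
  A3 = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
  stringsAsFactors = FALSE
)

# Data frame -> named list of environments, one per level of A2.
# Each environment binds the names found in A1 to the values in A3.
df_to_envs <- function(df) {
  lapply(split(df, df$A2), function(part) {
    list2env(setNames(as.list(part$A3), part$A1), parent = globalenv())
  })
}

# Named list of environments -> data frame for one computed variable.
envs_to_df <- function(envs, var = "z") {
  data.frame(
    A1 = var,
    A2 = names(envs),
    A3 = vapply(envs, function(env) get(var, envir = env), numeric(1),
                USE.NAMES = FALSE),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

envs <- df_to_envs(DF)                       # convert once, at the beginning
for (env in envs) {
  env$z <- eval(quote(a + 2 * b + c), envir = env)
}
result <- envs_to_df(envs)                   # convert back, at the very end
print(result)
```

The result has one row per A2 level, with A1 set to z and A3 holding 16, 20 and 24. Giving the environments globalenv() as parent is a design choice in this sketch: it keeps user-defined functions visible to the formula without exposing the local variables of df_to_envs.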
Duncan Murdoch