[R] using formula to calculate a column in dataframe

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Tue Jul 29 00:13:46 CEST 2014


On Mon, 28 Jul 2014, Pavneet Arora wrote:

> Hello All,
> I need to calculate a column (Vupper) using a formula, but I am not sure
> how to. It will be easier to explain with an example.
>
> Again this is my dataset:
> dput(nd)
> structure(list(week = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
> 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
> 29, 30), value = c(9.45, 7.99, 9.29, 11.66, 12.16, 10.18, 8.04,
> 11.46, 9.2, 10.34, 9.03, 11.47, 10.51, 9.4, 10.08, 9.37, 10.62,
> 10.31, 10, 13, 10.9, 9.33, 12.29, 11.5, 10.6, 11.08, 10.38, 11.62,
> 11.31, 10.52), cusum = c(-0.550000000000001, -2.56, -3.27, -1.61,
> 0.549999999999999, 0.729999999999999, -1.23, 0.229999999999999,
> -0.570000000000002, -0.230000000000002, -1.2, 0.269999999999998,
> 0.779999999999998, 0.179999999999998, 0.259999999999998,
> -0.370000000000003,
> 0.249999999999996, 0.559999999999997, 0.559999999999997, 3.56,
> 4.46, 3.79, 6.08, 7.58, 8.18, 9.26, 9.64, 11.26, 12.57, 13.09
> )), .Names = c("week", "value", "cusum"), row.names = c(NA, -30L
> ), class = "data.frame")
>
> I have some constants in my data. These are:
> sigma =1, h = 5, k = 0.5
>
> The formula requires me to start from the bottom row (30th in this case).
> The formula for the last row will be the 30th row's cusum value (13.09) +
> h(5) * sigma(1), giving me the value of 18.1
>
> Then the formula for the 29th row for Vupper uses the value of 30th Vupper
> (18.1) + k(0.5) * sigma(1) = giving me the value of 18.6
>
> Similarly the formula for the 28th row for Vupper will use value of 29th
> Vupper(18.6) + k(0.5) * sigma(1) = giving me the value of 19.1
>
> And so on...

This is a recurrence formula... each value depends on the previous value 
in the sequence. In general these can be computationally expensive in R, 
but there are certain very common cases that have built-in functions with 
which you can build many of the real-world cases you might encounter 
(such as this one).
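
For comparison, the recurrence can be written as an explicit loop that 
fills the vector from the last row upward. This is only a sketch: the 
function name vupper_loop and its argument names are made up for 
illustration, and it assumes cusum is a numeric vector with at least one 
element.

```r
# Hypothetical helper (not from the original post): explicit-loop version
# of the recurrence, filled from the last element back to the first.
vupper_loop <- function(cusum, sigma, h, k) {
  n <- length(cusum)
  out <- numeric(n)
  out[n] <- cusum[n] + h * sigma          # starting value on the last row
  for (i in rev(seq_len(n - 1))) {        # rows n-1, n-2, ..., 1
    out[i] <- out[i + 1] + k * sigma      # each row adds k * sigma to the row below
  }
  out
}

# Last three cusum values from the posted data, with the posted constants.
vupper_loop(c(11.26, 12.57, 13.09), sigma = 1, h = 5, k = 0.5)
```

With those three values the result should be 19.09, 18.59, 18.09, which 
matches the hand calculation in the question up to rounding.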

>
> Also, is there any way to make the formula generalised using loop or
> functions? Because I really don't want to have to re-write the program if
> my number of rows increase or decrease or if I use another dataset?
>
> So far my function looks like following (Without the Vupper formula in
> there):
> vmask2 <- function(data,target,sigma,h,k){
>  data$deviation <- data$value - target
>  data$cusums <- cumsum(data$deviation)
>  data$ma <- c(NA,abs(diff(data$value)))
>  data$Vupper <- *not sure what to put here*
>
>  data
> }

I avoid using the variable name "data" because R already provides a 
function of that name, so I call the data frame dta below.

sigma <- 1
h <- 5
k <- 0.5

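dta <- nd   # your data frame, from the dput() output above

# cumsum() carries the starting value to every row; the k * sigma steps
# from seq() are added afterwards, and rev() puts the start on the last row.
dta$Vupper <- rev( cumsum( c( dta[ nrow(dta), "cusum" ] + h * sigma
                             , rep( 0, nrow(dta) - 1 )
                             )
                         )
                   + seq( 0, by=k * sigma, length.out=nrow(dta) )
                 )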

Note how the terms in your algorithm are re-grouped into vectors that c 
and seq and rep can generate, and cumsum is used to implement the 
recurrence, and the rev function is used to reverse the vector.
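
The pieces can be inspected one at a time on a toy case. The values below 
(four rows, with the constants from your post) are picked only for 
illustration, and the intermediate names start, base and steps are my own 
labels rather than anything R requires.

```r
# Toy case: 4 rows, last cusum 13.09, h = 5, k = 0.5, sigma = 1.
n     <- 4
start <- 13.09 + 5 * 1                          # last cusum + h * sigma
base  <- cumsum(c(start, rep(0, n - 1)))        # start value carried to every position
steps <- seq(0, by = 0.5 * 1, length.out = n)   # 0, k*sigma, 2*k*sigma, ...
pieces <- rev(base + steps)                     # largest value first, start value last
pieces
```

Here pieces should come out as 19.59, 19.09, 18.59, 18.09.
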

If you are going to apply this to long sequences of data, you might want 
to guard against floating-point error building up in the seq call by 
generating the step counts as integers and scaling them once:

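# Same construction, but the step counts 0, 1, 2, ... are integers and are
# multiplied by k * sigma once, outside cumsum().
dta$Vupper <- rev( cumsum( c( dta[ nrow(dta), "cusum" ] + h * sigma
                             , rep( 0, nrow(dta) - 1 )
                             )
                         )
                   + k * sigma * seq( 0L, by=1L, length.out=nrow(dta) )
                 )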
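
To answer the question about generalising: one way to fold this into your 
vmask2 function is sketched below. Treat it as a starting point rather 
than a finished implementation. The argument name dta is my substitution 
for data, the closed-form expression is an equivalent rewrite of the 
recurrence, and target = 10 is an assumption inferred from your posted 
cusum column (9.45 - 10 = -0.55). It assumes the data frame has at least 
one row.

```r
# Sketch of a generalised vmask2(); works for any number of rows >= 1.
vmask2 <- function(dta, target, sigma, h, k) {
  n <- nrow(dta)
  dta$deviation <- dta$value - target
  dta$cusums    <- cumsum(dta$deviation)
  dta$ma        <- c(NA, abs(diff(dta$value)))
  # Closed form of the recurrence: row i sits (n - i) steps above the last row.
  dta$Vupper    <- dta$cusums[n] + h * sigma + k * sigma * (n - seq_len(n))
  dta
}

# First four rows of the posted data, just to show the call.
toy <- data.frame(week = 1:4, value = c(9.45, 7.99, 9.29, 11.66))
vmask2(toy, target = 10, sigma = 1, h = 5, k = 0.5)
```

Because the number of rows is taken from nrow(dta) inside the function, 
nothing needs rewriting when the data set grows, shrinks or is replaced.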

>
> 	[[alternative HTML version deleted]]

Please send your emails in plain text, as the Posting Guide requests. HTML 
often corrupts what you send to the list.

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k


