[R] Filling NA with cumprod?

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Fri May 25 18:18:15 CEST 2012


This calls for a trick I have seen before on this list.  Once you 
understand it, you will be able to apply it to many similar problems.
The key is the "ave" function, which applies a function separately to each 
group of values in a vector (the groups are given by a second vector of 
levels) and returns a result the same length as the input.

a <- c(1, 2, 3, NA, NA, 6, 7, NA, NA, 10)
f <- c(0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1)

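at <- ifelse( is.na(a), f, a )  # multiplier where a is NA, known value elsewhere
lt <- cumsum( !is.na( a ) )     # level goes up by one at each known value
cbind( lt, at ) # see the pattern of levels that will control ave
ave( at, lt, FUN=cumprod )      # cumulative product restarts within each level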

or in one statement

ave( ifelse( is.na(a), f, a ), cumsum( !is.na( a ) ), FUN=cumprod )
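
A minimal sketch of the same idiom wrapped in a function, in case it helps 
to experiment.  The name fill_na_cumprod and the lag_f argument are my own 
inventions for illustration, not anything from base R.  With lag_f = FALSE 
a filled element is multiplied by f[i], as in the code above; with 
lag_f = TRUE it is multiplied by f[i-1], which is how a line like 
o[8] = o[7]*f[7] in the original question reads if taken literally.

```r
# Hypothetical helper (not part of base R): wraps the ave/cumprod idiom.
fill_na_cumprod <- function(a, f, lag_f = FALSE) {
  stopifnot(length(a) == length(f))
  if (lag_f) f <- c(1, f[-length(f)])  # shift coefficients right by one
  at <- ifelse(is.na(a), f, a)         # multiplier where a is NA, value elsewhere
  lt <- cumsum(!is.na(a))              # new level at every known value
  ave(at, lt, FUN = cumprod)           # cumprod restarts within each level
}

a <- c(1, 2, 3, NA, NA, 6, 7, NA, NA, 10)
f <- c(0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1)

o1 <- fill_na_cumprod(a, f)                # uses f[i]
o2 <- fill_na_cumprod(a, f, lag_f = TRUE)  # uses f[i-1]

# all.equal rather than identical, to allow for floating point rounding
all.equal(o1, c(1, 2, 3, 3.3, 2.97, 6, 7, 7.7, 6.93, 10))
all.equal(o2, c(1, 2, 3, 2.7, 2.97, 6, 7, 6.3, 6.93, 10))
```

With lag_f = FALSE, a leading NA falls into level 0 and is filled as if 
the value before it were 1, which matches prev <- 1 in the loop quoted 
below.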

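When learning, the trickiest step is defining the vector of levels. 
Usually a cumsum of booleans that mark transitions is involved. Sometimes 
a pattern like rev(test(rev(data))) can be useful, for example when the 
levels have to be accumulated from the end of the vector backward.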
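
As one possible reading of the rev pattern (taking cumsum of a logical 
test as the "test" step, which is my assumption), the sketch below builds 
levels from the end of the vector so that each run of NAs is grouped with 
the known value that follows it rather than the one before it.  The 
backward fill at the end is only there to show the grouping and is not 
part of the original problem.

```r
a <- c(1, 2, 3, NA, NA, 6, 7, NA, NA, 10)

# Forward levels: each NA run joins the known value before it.
lf <- cumsum(!is.na(a))
# Backward levels: each NA run joins the known value after it.
lb <- rev(cumsum(rev(!is.na(a))))

cbind(a, lf, lb)  # compare the two grouping patterns side by side

# Illustration only: carry the next known value backward over each NA run.
# Within a backward level the known value is always the last element.
nocb <- ave(a, lb, FUN = function(x) rep(x[length(x)], length(x)))
nocb
```

For this a, lf is 1 2 3 3 3 4 5 5 5 6 and lb is 6 5 4 3 3 3 2 1 1 1, so 
nocb comes out as 1 2 3 6 6 6 7 10 10 10.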

On Fri, 25 May 2012, David L Carlson wrote:

> This will loop only as many times as the largest number of consecutive NA's
> but uses vectorization within the loop. As currently defined, it will loop
> forever if the first value is NA.
>
> a <- c(1, 2, 3, NA, NA, 6, 7, NA, NA, 10)
> f <- c(0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1)
>
> a1 <- a
> alag <- c(NA, a1[-length(a1)])  # a1 shifted right by one position
> # change the NA above to the value to use if the first value in a is NA
>
> while (sum(is.na(a1)) > 0) {
>  a1 <- ifelse(is.na(a1), f*alag, a1)
>  alag <- c(NA, a1[-length(a1)])
> }
>
> ----------------------------------------------
> David L Carlson
> Associate Professor of Anthropology
> Texas A&M University
> College Station, TX 77843-4352
>
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
>> project.org] On Behalf Of Igor Reznikovsky
>> Sent: Friday, May 25, 2012 9:08 AM
>> To: Petr Savicky
>> Cc: r-help at r-project.org
>> Subject: Re: [R] Filling NA with cumprod?
>>
>> Hello Petr,
>>
>> Yes, I was hoping to avoid using loops.  If nothing else works, I will
>> take that approach as a last resort.
>>
>> Thank you,
>> Igor.
>> On May 25, 2012 2:26 AM, "Petr Savicky" <savicky at cs.cas.cz> wrote:
>>
>>> On Thu, May 24, 2012 at 08:24:38PM -0700, igorre25 wrote:
>>>> Hello,
>>>>
>>>> I need to build certain interpolation logic using R.  Unfortunately, I just
>>>> started using R, and I'm not familiar with lots of advanced or just
>>>> convenient features of the language to make this simpler.  So I struggled
>>>> for a few days and pretty much reduced the whole exercise to the following
>>>> problem, which I cannot resolve:
>>>>
>>>> Assume we have a vector of some values with NA:
>>>> a <- c(1, 2, 3, NA, NA, 6, 7, NA, NA, 10)
>>>>
>>>> and some coefficients as a vector of the same length:
>>>>
>>>> f <- c(0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1)
>>>>
>>>> I need to come up with a function to get the following output
>>>>
>>>> o[1] = a[1]
>>>> o[2] = a[2]
>>>> o[3] = a[3]
>>>> o[4] = o[3]*f[3] # Because a[4] is NA
>>>> o[5] = o[4]*f[4] # Because a[5] is NA; this looks like a recursive
>>>> calculation.  If the rest of the elements were NA, I would use a *
>>>> c(rep(1, 3), cumprod(f[3:9])), but that's not the case
>>>> o[6] = a[6] # Not NA anymore
>>>> o[7] = a[7]
>>>> o[8] = o[7]*f[7] # Again a[8] is NA
>>>> o[9] = o[8]*f[8]
>>>> o[10] = a[10] # Not NA
>>>>
>>>> Even though my explanation may seem complex, in reality the requirement is
>>>> pretty simple and in Excel is achieved with a very short formula.
>>>>
>>>> The need to use R is to demonstrate capabilities of the language and then to
>>>> expand to more complex problems.
>>>
>>> Hello:
>>>
>>> How is the output defined, if a[1] is NA?
>>>
>>> I think, you are not asking for a loop solution. However, in this case,
>>> it can be a reasonable option. For example
>>>
>>>  a <- c(1, 2, 3, NA, NA, 6, 7, NA, NA, 10)
>>>  f <- c(0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1)
>>>  n <- length(a)
>>>  o <- rep(NA, times=n)
>>>
>>>  prev <- 1
>>>  for (i in 1:n) {
>>>      if (is.na(a[i])) {
>>>          o[i] <- f[i]*prev
>>>      } else {
>>>          o[i] <- a[i]
>>>      }
>>>      prev <- o[i]
>>>  }
>>>
>>> A more straightforward translation of the Excel formulas is
>>>
>>>  getCell <- function(i)
>>>  {
>>>      if (i == 0) return(1)
>>>      if (is.na(a[i])) {
>>>          return(f[i]*getCell(i-1))
>>>      } else {
>>>          return(a[i])
>>>      }
>>>  }
>>>
>>>  x <- rep(NA, times=n)
>>>  for (i in 1:n) {
>>>      x[i] <- getCell(i)
>>>  }
>>>
>>>  identical(o, x) # [1] TRUE
>>>
>>> Petr Savicky.
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k


