[R] Vectorizing a loop

David Winsemius dwinsemius at comcast.net
Tue Feb 7 19:04:01 CET 2012


On Feb 7, 2012, at 12:56 PM, Jeff Newmiller wrote:

> On Tue, 7 Feb 2012, Alexander Shenkin wrote:
>
>> Hello Folks,
>>
>> I'm trying to vectorize a loop that processes rows of a dataframe. It
>> involves lots of conditionals, such as "If column 10 == 3, and if column
>> 3 is True, and both column 5 and 6 are False, then set column 4 to True".
>>
>> So, for example, any ideas about vectorizing the following?
>>
>> df = data.frame( list(a=c(1,2,3,4), b=c("a","b","c","d"),
>>                       c=c(T,F,T,F), d=NA, e=c(F,F,T,T)) )
>>
>> for (i in 1:nrow(df)) {
>>
>>   if (df[i,3] %in% c(FALSE,NA) & (df[i,1] > 2 | df[i,5]) ) {
>>       df[i,4] = 1
>>   }
>>
>>   if (df[i,5] %in% c(TRUE, NA) & df[i,2] == "b") {
>>       df[i,4] = 2
>>       df[i,5] = T
>>   }
>>
>> }
>
> Your code attempts to do some things with NA that won't behave the way
> you expect them to. Specifically, you cannot use %in% to test for NA,

Huh?

 > NA %in% NA
[1] TRUE
 > NA %in% c(5, NA)
[1] TRUE
 > NA %in% c(5, 6)
[1] FALSE
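
For contrast, here is a small sketch of how the other NA tests behave, and of the one place where the original condition can still hand an NA to if(). The vector x and the result names are only for illustration; they are not from the original post.

```r
x <- c(TRUE, FALSE, NA)

eq_na  <- x == NA               # `==` propagates NA: NA NA NA
in_na  <- x %in% NA             # match()-based, NA matches NA: FALSE FALSE TRUE
in_set <- x %in% c(FALSE, NA)   # never NA: FALSE TRUE TRUE

# The part of the original condition that can go wrong is the unguarded
# `df[i,1] > 2 | df[i,5]`: FALSE | NA is NA, and TRUE & NA is NA too.
cond <- (NA %in% c(FALSE, NA)) & (1 > 2 | NA)

# if() refuses an NA condition; tryCatch() only captures the error here
res <- tryCatch(if (cond) "ran" else "skipped",
                error = function(e) "error")
```

So %in% itself is safe in the test; it is the bare comparison next to it that needs an is.na() guard.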

-- 
David.
> and you cannot give the "if" function an NA.  It only appears to work
> because you don't actually give it a complete set of test values
> consistent with your tests in the loop. My guess at your intent is:
>
> df <- data.frame( list( a=c(1,2,3,4,5)
>                      , b=c("a","b","c","d","e")
>                      , c=c(TRUE,FALSE,TRUE,FALSE,NA)
>                      , d=NA
>                      , e=c(FALSE,FALSE,TRUE,TRUE,NA)
>                      ) )
> tmpdf <- df
>
> for (i in 1:nrow(df)) {
>
>    if ( ( is.na(df[i,3]) || !df[i,3] ) &&
>         ( df[i,1] > 2 || ( is.na( df[i,5] ) || df[i,5] ) ) ) {
>        df[i,4] <- 1
>    }
>
>    if ( ( is.na( df[i,5] ) || df[i,5] ) && df[i,2] == "b" ) {
>        df[i,4] <- 2
>        df[i,5] <- TRUE
>    }
>
> }
>
> df2 <- df
> df <- tmpdf
>
> # intermediate logical vectors for clarity
> tmp <- ( is.na(df[[3]]) | !df[[3]] ) & ( df[[1]] > 2 | is.na(df[[5]]) | df[[5]] )
> tmp2 <- ( is.na(df[[5]]) | df[[5]] ) & df[[2]] == "b"
> df[ tmp, "d" ] <- 1
> df[ tmp2, "d" ] <- 2
> df[ tmp2, "e" ] <- TRUE
>
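
One way to check that a vectorized form reproduces the loop is to run both on the same data and compare. The sketch below is self-contained and makes two assumptions of its own: row 2's e is set to NA (not FALSE as in the quoted data) so that the second rule actually fires, and every mask guards e with is.na() the same way the loop does, so no NA reaches the row index. The names make_df, loop_df, vec_df, e_ok, m1 and m2 are made up for the example.

```r
make_df <- function() {
  data.frame(a = c(1, 2, 3, 4, 5),
             b = c("a", "b", "c", "d", "e"),
             c = c(TRUE, FALSE, TRUE, FALSE, NA),
             d = NA,
             e = c(FALSE, NA, TRUE, TRUE, NA),   # row 2 changed to NA (assumption)
             stringsAsFactors = FALSE)
}

# Row-by-row version, following the quoted loop
loop_df <- make_df()
for (i in seq_len(nrow(loop_df))) {
  e_ok <- is.na(loop_df[i, 5]) || loop_df[i, 5]
  if ((is.na(loop_df[i, 3]) || !loop_df[i, 3]) &&
      (loop_df[i, 1] > 2 || e_ok)) {
    loop_df[i, 4] <- 1
  }
  if (e_ok && loop_df[i, 2] == "b") {
    loop_df[i, 4] <- 2
    loop_df[i, 5] <- TRUE
  }
}

# Vectorized version: build both masks first, then assign in rule order
vec_df <- make_df()
e_ok <- is.na(vec_df$e) | vec_df$e
m1 <- (is.na(vec_df$c) | !vec_df$c) & (vec_df$a > 2 | e_ok)
m2 <- e_ok & vec_df$b == "b"
vec_df[m1, "d"] <- 1
vec_df[m2, "d"] <- 2
vec_df[m2, "e"] <- TRUE

same <- isTRUE(all.equal(loop_df, vec_df))
```

Computing the masks before any assignment matters: the second rule rewrites e, and the loop evaluates both of its tests for a row before that rewrite takes effect.
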
> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                      Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k

David Winsemius, MD
West Hartford, CT


