[R] applying ifelse to dataframe

Peter Ehlers ehlers at ucalgary.ca
Tue Jun 22 10:02:29 CEST 2010


On 2010-06-22 1:45, steven mosher wrote:
> Hmm
>
>>
> DF<-data.frame(name=rep(1:5,each=2),x1=rep("A",10),x2=seq(10,19,by=1),x3=rep(NA,10),x4=seq(20,29,by=1))
>    DF$x3[5]<-50
>   mask<-apply(sample,2,"%in%", target)

This is getting confusing. What's 'sample'?
What's 'target'? Probably what you originally called 'targets'.

>    DF
>     name x1 x2 x3 x4
> 1     1  A 10 NA 20
> 2     1  A 11 NA 21
> 3     2  A 12 NA 22
> 4     2  A 13 NA 23
> 5     3  A 14 50 24
> 6     3  A 15 NA 25
> 7     4  A 16 NA 26
> 8     4  A 17 NA 27
> 9     5  A 18 NA 28
> 10    5  A 19 NA 29
>    mask
>        [,1]  [,2]  [,3]  [,4]  [,5]
> [1,] FALSE FALSE FALSE FALSE FALSE
> [2,] FALSE FALSE FALSE FALSE FALSE
> [3,]  TRUE  TRUE FALSE  TRUE FALSE
> [4,] FALSE FALSE FALSE FALSE FALSE
> [5,]  TRUE FALSE FALSE FALSE FALSE


This suggests that 'sample' may be a matrix, not
a dataframe.
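
A quick way to check is to compare shapes. The following is only a sketch: DF and targets are copied from the original post, and the is.matrix() and dim() checks are added here for illustration. A mask built from DF[,3:5] has 10 rows and 3 columns, so a 5 x 5 mask must have come from some other object.

```r
# Data and targets as given in the original post
DF <- data.frame(name = rep(1:5, each = 2), x1 = rep("A", 10),
                 x2 = seq(10, 19, by = 1), x3 = rep(NA, 10),
                 x4 = seq(20, 29, by = 1))
DF$x3[5] <- 50
targets <- c(11, 12, 13, 16, 19, 50, 27, 24, 22, 26)

# apply() coerces its first argument to a matrix and returns a matrix here
mask <- apply(DF[, 3:5], 2, "%in%", targets)

is.data.frame(DF)   # TRUE
is.matrix(mask)     # TRUE
dim(mask)           # 10 3
```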

Anyway, try this on your original problem:

  targets<-c(11,12,13,16,19,50,27,24,22,26)
  mask<-apply(DF[,3:5],2, "%in%" ,targets)
  is.na(DF[3:5]) <- !mask
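
Spelled out as a self-contained sketch (the DF construction is copied from the original post; the variables kept_x2, kept_x3 and kept_x4 are hypothetical names added here only to inspect the result):

```r
# Data and targets as given in the original post
DF <- data.frame(name = rep(1:5, each = 2), x1 = rep("A", 10),
                 x2 = seq(10, 19, by = 1), x3 = rep(NA, 10),
                 x4 = seq(20, 29, by = 1))
DF$x3[5] <- 50
targets <- c(11, 12, 13, 16, 19, 50, 27, 24, 22, 26)

# TRUE where a value in columns x2, x3, x4 is one of the targets
mask <- apply(DF[, 3:5], 2, "%in%", targets)

# Set to NA every element of columns 3:5 that is NOT a target
is.na(DF[3:5]) <- !mask

# Inspect what survived in each column
kept_x2 <- DF$x2[!is.na(DF$x2)]   # 11 12 13 16 19
kept_x3 <- DF$x3[!is.na(DF$x3)]   # 50
kept_x4 <- DF$x4[!is.na(DF$x4)]   # 22 24 26 27
DF
```

The replacement form is.na(x) <- value sets to NA the elements of x where value is TRUE, which is why the mask is negated. The columns name and x1 are never touched because only DF[3:5] is on the left-hand side.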

   -Peter Ehlers

>    mask<-data.frame(a=TRUE,b=TRUE,!mask)
>    DF[mask]<-NA
> Error in FUN(X[[1L]], ...) :
>    only defined on a data frame with all numeric variables
>    DF2<-data.frame(DF[,3:5])
>    mask<-apply(sample,2,"%in%", target)
>    mask<-data.frame(!mask)
>    DF2[mask]<-NA
> Error in FUN(X[[1L]], ...) :
>    only defined on a data frame with all numeric variables
>    DF2
>     x2 x3 x4
> 1  10 NA 20
> 2  11 NA 21
> 3  12 NA 22
> 4  13 NA 23
> 5  14 50 24
> 6  15 NA 25
> 7  16 NA 26
> 8  17 NA 27
> 9  18 NA 28
> 10 19 NA 29
>    mask<-apply(DF2,2,"%in%", target)
>    mask<-data.frame(!mask)
>    DF2[mask]<-NA
> Error in FUN(X[[1L]], ...) :
>    only defined on a data frame with all numeric variables
>
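
If you do want to index the whole of DF in one step, one option that avoids converting the mask to a data frame is to keep it as a logical matrix with the same dimensions as DF. This is a sketch, not a diagnosis of the error above: full_mask is a hypothetical name, and the two padding columns are FALSE on the assumption that name and x1 should be left alone (TRUE marks a cell to be set to NA).

```r
# Data and targets as given in the original post
DF <- data.frame(name = rep(1:5, each = 2), x1 = rep("A", 10),
                 x2 = seq(10, 19, by = 1), x3 = rep(NA, 10),
                 x4 = seq(20, 29, by = 1))
DF$x3[5] <- 50
targets <- c(11, 12, 13, 16, 19, 50, 27, 24, 22, 26)

mask <- apply(DF[, 3:5], 2, "%in%", targets)

# Pad with FALSE for 'name' and 'x1' so those columns are never blanked.
# cbind() recycles the scalars down the 10 rows, giving a 10 x 5 logical matrix.
full_mask <- cbind(FALSE, FALSE, !mask)

# Logical-matrix replacement: cells where full_mask is TRUE become NA
DF[full_mask] <- NA
DF
```
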
> On Tue, Jun 22, 2010 at 12:23 AM, Petr PIKAL<petr.pikal at precheza.cz>  wrote:
>
>> Hi
>>
>> r-help-bounces at r-project.org wrote on 22.06.2010 08:28:04:
>>
>>> The following dataframe will illustrate the problem
>>>
>>>
>>> DF<-data.frame(name=rep(1:5,each=2),x1=rep("A",10),x2=seq(10,19,by=1),x3=rep(NA,10),x4=seq(20,29,by=1))
>>>   DF$x3[5]<-50
>>>
>>>   # We have a data frame. We are interested in the columns x2, x3, x4, which
>>>   # contain sparse values and many NAs.
>>>   DF
>>>     name x1 x2 x3 x4
>>> 1     1  A 10 NA 20
>>> 2     1  A 11 NA 21
>>> 3     2  A 12 NA 22
>>> 4     2  A 13 NA 23
>>> 5     3  A 14 50 24
>>> 6     3  A 15 NA 25
>>> 7     4  A 16 NA 26
>>> 8     4  A 17 NA 27
>>> 9     5  A 18 NA 28
>>> 10    5  A 19 NA 29
>>>
>>> # We have a list of "target" values that we want to search for in the
>>> # data frame.
>>> # If the value is in the data frame we want to keep it there; otherwise,
>>> # replace it with NA.
>>>
>>> targets<-c(11,12,13,16,19,50,27,24,22,26)
>>> # So we apply a test by column to the last 3 columns using the "%in%" test.
>>> # This gives us a mask of whether each data frame element is in the
>>> # target list.
>>>
>>> mask<-apply(DF[,3:5],2, "%in%" ,targets)
>>> mask
>>>
>>>           x2    x3    x4
>>>   [1,] FALSE FALSE FALSE
>>>   [2,]  TRUE FALSE FALSE
>>>   [3,]  TRUE FALSE  TRUE
>>>   [4,]  TRUE FALSE FALSE
>>>   [5,] FALSE  TRUE  TRUE
>>>   [6,] FALSE FALSE FALSE
>>>   [7,]  TRUE FALSE  TRUE
>>>   [8,] FALSE FALSE  TRUE
>>>   [9,] FALSE FALSE FALSE
>>> [10,]  TRUE FALSE FALSE
>>>
>>> # and so DF[2,3] is equal to 11 and 11 is in the target list, so the mask is
>>> # TRUE.
>>> # Now something like DF<- ifelse(mask==T,DF,NA) is CONCEPTUALLY what I want
>>
>> Data frames are quite clever in preserving their dimensions. I would do
>>
>> mask=data.frame(a=TRUE, b=TRUE, !mask)
>>
>> to add columns 1 and 2
>>
>> and
>>
>> DF[mask]<-NA
>>
>> Regards
>> Petr
>>
>>
>>> to do.
>>> In the end I'd like a result that looks like
>>>
>>>     name x1 x2 x3 x4
>>> 1     1  A NA NA NA
>>> 2     1  A 11 NA NA
>>> 3     2  A 12 NA 22
>>> 4     2  A 13 NA NA
>>> 5     3  A NA 50 24
>>> 6     3  A NA NA NA
>>> 7     4  A 16 NA 26
>>> 8     4  A NA NA 27
>>> 9     5  A NA NA NA
>>> 10    5  A 19 NA NA
>>>
>>> I've tried forcing the DF and the mask into vectors so that ifelse() would
>>> work, and have tried "apply" using ifelse, without much luck. Any thoughts?
>>>