[R] Conditional replacement and removal of data frame values

David Winsemius dwinsemius at comcast.net
Tue Sep 1 01:48:15 CEST 2015


On Aug 31, 2015, at 1:49 PM, Luigi Marongiu wrote:

> Dear all,
> I have a data frame and I would like to do the following:
> a) replace value of one variable "a" according to the value of another one "b"
> b) remove all the instances of the variable "b"
> 
> For the sake of argument, let's say I have the following data frame:
> test <- rep(c("Adenovirus", "Rotavirus", "Norovirus", "Rotarix",
> "Sapovirus"), 3)
> res <- c(0, 1, 0, 0, 1,
>         1, 0, 1, 1, 0,
>         0, 1, 0, 1, 0)
> samp <- c(rep(1, 5), rep(2, 5), rep(3, 5))
> df <- data.frame(test, res, samp, stringsAsFactors = FALSE)
> 
> The task I need is to coerce the results of the "Rotavirus" to
> negative (0) if and only if "Rotarix" is positive (1). In this
> example, the results show that for "samp" 3 "Rotavirus" should be 0:
>         test res samp
> 2  Rotavirus   1    1
> 4    Rotarix   0    1
> 7  Rotavirus   0    2
> 9    Rotarix   1    2
> 12 Rotavirus   1    3
> 14   Rotarix   1    3
> 
> I can't use the subset function because then I would work on a
> separate object and I don't know how to implement the conditions for
> the replacements.
> Finally, all the "Rotarix" entries should be removed from the data frame.

From context it appears you want to do this testing within groups determined by 'samp', so you might choose to use an lapply-split approach:

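lapply( split(df, df$samp),
        FUN = function(d) {
          # if Rotarix is positive in this sample, force Rotavirus to 0;
          # any() also copes with a sample that has no Rotarix row
          if ( any(d$res[d$test == "Rotarix"] == 1) ) d$res[d$test == "Rotavirus"] <- 0
          d[d$test != "Rotarix", ]   # drop the Rotarix rows either way
        } )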
$`1`
        test res samp
1 Adenovirus   0    1
2  Rotavirus   1    1
3  Norovirus   0    1
5  Sapovirus   1    1

$`2`
         test res samp
6  Adenovirus   1    2
7   Rotavirus   0    2
8   Norovirus   1    2
10  Sapovirus   0    2

$`3`
         test res samp
11 Adenovirus   0    3
12  Rotavirus   0    3
13  Norovirus   0    3
15  Sapovirus   0    3

It's pretty easy to rbind.data.frame those together:

do.call( rbind.data.frame,
         lapply( split(df, df$samp),
                 FUN = function(d) {
                   # zero out Rotavirus where Rotarix is positive, then drop Rotarix
                   if ( any(d$res[d$test == "Rotarix"] == 1) ) d$res[d$test == "Rotavirus"] <- 0
                   d[d$test != "Rotarix", ]
                 } ) )
           test res samp
1.1  Adenovirus   0    1
1.2   Rotavirus   1    1
1.3   Norovirus   0    1
1.5   Sapovirus   1    1
2.6  Adenovirus   1    2
2.7   Rotavirus   0    2
2.8   Norovirus   1    2
2.10  Sapovirus   0    2
3.11 Adenovirus   0    3
3.12  Rotavirus   0    3
3.13  Norovirus   0    3
3.15  Sapovirus   0    3
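
If you would rather skip the split/rbind round trip, the same result can probably be had with plain logical indexing. This is only a sketch, and it assumes that each 'samp' has at most one "Rotarix" row and that 'res' is coded 0/1 as in your example; 'pos_samp' and 'df2' are names I made up for illustration:

```r
# Rebuild the example data from the original question
test <- rep(c("Adenovirus", "Rotavirus", "Norovirus", "Rotarix",
              "Sapovirus"), 3)
res  <- c(0, 1, 0, 0, 1,
          1, 0, 1, 1, 0,
          0, 1, 0, 1, 0)
samp <- c(rep(1, 5), rep(2, 5), rep(3, 5))
df   <- data.frame(test, res, samp, stringsAsFactors = FALSE)

# Samples in which Rotarix is positive (hypothetical helper vector)
pos_samp <- df$samp[df$test == "Rotarix" & df$res == 1]

# Work on a copy: set Rotavirus to 0 in those samples,
# then drop every Rotarix row
df2 <- df
df2$res[df2$test == "Rotavirus" & df2$samp %in% pos_samp] <- 0
df2 <- df2[df2$test != "Rotarix", ]

# Optional: renumber the rows 1..n instead of keeping the original row names
rownames(df2) <- NULL
```

The last line is also a way to tidy the "1.1", "1.2", ... row names that rbind.data.frame builds from the list names, should you prefer plain numbering.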



> Thank you.
> Best regards,
> Luigi
> 

David Winsemius
Alameda, CA, USA


