[R] If statement - copying a factor variable to a new variable

peter dalgaard pdalgd at gmail.com
Thu Jun 28 20:58:37 CEST 2012


On Jun 28, 2012, at 09:42 , Rui Barradas wrote:

> Hello,
> 
> Another way is to use index vectors:
> 
> 
> v1.factor <- c("S","S","D","D","D",NA)
> v2.factor <- c("D","D","S","S","S","S")
> 
> td2 <- test.data <- data.frame(v1.factor,v2.factor)
> 
> for (i in 1:nrow(test.data) ) {
> 
>    [... etc ...]
> 
> } #End FOR
> 
> # Create index vectors
> na1 <- is.na(v1.factor)
> na2 <- is.na(v2.factor)
> 
> # Create 'newvar' with default value
> td2$newvar <- NA
> # Now, set values if condition is met.
> td2$newvar[!na1 & !na2] <- as.character(td2$v1.factor[!na1 & !na2])
> 
> all.equal(test.data, td2)
> [1] TRUE
> 

Shouldn't this rather be something like

new <- td2$v1.factor
i <- is.na(new)
new[i] <- td2$v2.factor[i]
td2$newvar <- new

?? 
(Caveat: things could go wrong if the levels differ.)
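
To make the caveat concrete, here is a minimal sketch with made-up data (f1, f2 and the extra level "X" are hypothetical, not taken from the original post). As far as I know, assigning a value that is not among the target factor's levels produces NA with a warning in current versions of R, so one cautious option is to do the replacement on character copies and then rebuild the factor over the union of both level sets:

```r
# Hypothetical data: f2 has a level ("X") that f1 lacks.
f1 <- factor(c("S", NA, "D", NA))
f2 <- factor(c("D", "X", "S", "S"))

# Do the replacement on character copies, so no level lookup is involved.
new_chr <- as.character(f1)
miss <- is.na(new_chr)
new_chr[miss] <- as.character(f2)[miss]

# Rebuild a factor whose levels cover both inputs.
newvar <- factor(new_chr, levels = union(levels(f1), levels(f2)))
```

When both factors already share the same levels, as in the posted example, the direct factor assignment above should be enough and this detour is unnecessary.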

> 
> I find this approach better for multiple if/else branches that depend on combinations of a small number of conditions. It's also very readable.
> 
> Hope this helps,
> 
> Rui Barradas
> 
> On 28-06-2012 08:00, Miguel Manese wrote:
>> Hi James,
>> 
>> On Thu, Jun 28, 2012 at 12:33 AM, James Holland <holland.aggie at gmail.com> wrote:
>>> I need to look through a dataset with two factor variables, and depending
>>> on certain criteria, create a new variable containing the data from one of
>>> those other variables.
>>> 
>>> The problem is, R keeps making my new variable an integer and saving the
>>> data as a 1 or 2 (I believe the levels of the factor).
>>> 
>>> I've tried using as.factor in the IF output statement, but that doesn't
>>> seem to work.
>>> 
>>> Any help is appreciated.
>>> 
>>> 
>>> 
>>> #Sample code
>>> 
>>> rm(list=ls())
>>> 
>>> 
>>> v1.factor <- c("S","S","D","D","D",NA)
>>> v2.factor <- c("D","D","S","S","S","S")
>>> 
>>> test.data <- data.frame(v1.factor,v2.factor)
>> 
>> The vectorized way to do that would be
>> 
>> # v1.factor if present, else v2.factor
>> test.data$newvar <- ifelse(!is.na(v1.factor), v1.factor, v2.factor)
>> 
>> I suggest you work with the character levels first then convert it
>> into a factor, e.g. if v1.factor & v2.factor are already factors, do:
>> 
>> test.data$newvar <- as.factor(ifelse(!is.na(v1.factor),
>> as.character(v1.factor), as.character(v2.factor)))
>> 
>> 
>> 
>> Regards,
>> 
>> Jon
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


