[R] using ifelse to remove NA's from specific columns of a data frame containing strings and numbers

arun smartpink111 at yahoo.com
Thu Nov 15 17:25:57 CET 2012


Hi,

df1<-read.table(text="
col1 col2 col3
A   15.5   8.5
A   8.5    7.5
A   NA     NA
B   8.0   6.0
B   NA     NA
B   9.0   10.0
",sep="",header=TRUE,stringsAsFactors=FALSE)
 str(df1)
#'data.frame':    6 obs. of  3 variables:
# $ col1: chr  "A" "A" "A" "B" ...
# $ col2: num  15.5 8.5 NA 8 NA 9
# $ col3: num  8.5 7.5 NA 6 NA 10

 df1$col2[is.na(df1$col2)]<-0
 df1$col3[is.na(df1$col3)]<-1
 df1
#  col1 col2 col3
#1    A 15.5  8.5
#2    A  8.5  7.5
#3    A  0.0  1.0
#4    B  8.0  6.0
#5    B  0.0  1.0
#6    B  9.0 10.0

#or if you want to use ifelse() from the original df1

 ifelse(is.na(df1$col2),0,df1$col2)
#[1] 15.5  8.5  0.0  8.0  0.0  9.0
 ifelse(is.na(df1$col3),1,df1$col3)
#[1]  8.5  7.5  1.0  6.0  1.0 10.0
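
The same column-by-column replacement can be written once for any number of columns. This is only a minimal sketch: the named list `fill` is a hypothetical helper that does not appear in the original post, and the data frame is re-read so the example starts from the version that still contains NAs.

```r
df1 <- read.table(text = "
col1 col2 col3
A   15.5   8.5
A   8.5    7.5
A   NA     NA
B   8.0   6.0
B   NA     NA
B   9.0   10.0
", header = TRUE, stringsAsFactors = FALSE)

# 'fill' is a hypothetical lookup: column name -> value to use in place of NA
fill <- list(col2 = 0, col3 = 1)

for (nm in names(fill)) {
  # ifelse() sees one numeric column at a time, so col1 is never touched
  df1[[nm]] <- ifelse(is.na(df1[[nm]]), fill[[nm]], df1[[nm]])
}
df1
```

Because each call works on a single numeric vector, no coercion between strings and numbers can happen.
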
A.K.




----- Original Message -----
From: David Romano <dromano at stanford.edu>
To: r-help at r-project.org
Cc: 
Sent: Thursday, November 15, 2012 6:19 AM
Subject: [R] using ifelse to remove NA's from specific columns of a data frame containing strings and numbers

Hi everyone,

I have a data frame one of whose columns is a character vector and the rest
are numeric, and in debugging a script, I noticed that an ifelse call seems
to be coercing the character column to a numeric column, and producing
unintended values as a result. Roughly, here's what I tried to do:

df: a data frame with, say, the first column as a character column and the
second and third columns numeric.

also: NA's occur only in the numeric columns, and if they occur in one,
they occur in the other as well.

I wanted to replace the NA's in column 2 with 0's and the ones in column 3
with 1's, so first I did this:

> na.replacements <- ifelse(col(df)==2,0,1)

Then I used a second ifelse call to try to remove the NA's as I wanted,
first by doing this:

> clean.df <- ifelse(is.na(df), na.replacements, df)

which produced a list of lists vaguely resembling df, with the NA's mostly
intact, and so then I tried this:

> clean.df <- ifelse(is.na(df), na.replacements, unlist(df))

which seems to work if all the columns are numeric, but otherwise changes
strings to numbers.
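
One likely explanation for the strings turning into numbers, offered as an assumption rather than something the post confirms: if the string column was stored as a factor (the default for data.frame() and read.table() before R 4.0.0), unlist() falls back to the factor's integer codes when it flattens it together with numeric columns. The small data frame below is made up for illustration.

```r
# Assumption: the string column is a factor, as older R versions created by default
df <- data.frame(col1 = c("A", "B", "B"),
                 col2 = c(15.5, NA, 9),
                 stringsAsFactors = TRUE)

flat <- unlist(df)
is.numeric(flat)      # the factor has been replaced by its integer codes
unname(flat[1:3])     # codes standing in for "A", "B", "B"

# With a plain character column, everything is coerced to character instead
df_chr <- data.frame(col1 = c("A", "B", "B"),
                     col2 = c(15.5, NA, 9),
                     stringsAsFactors = FALSE)
is.character(unlist(df_chr))
```

Either way, flattening a mixed-mode data frame forces every cell into one common type, which is why unlist(df) cannot preserve both the strings and the numbers.
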

I can't make sense of the help documentation enough to clear this up, but
my guess is that the "yes" and "no" values passed to ifelse need to be
vectors, in which case it seems I'll have to use another approach entirely,
but even if that is not the case and lists are acceptable, I'm not sure how to
convert a mixed-mode data frame into a vector-like list of elements (which
I would hope would work).
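
For what it's worth, ifelse() returns a result shaped like its test argument and fills it from "yes" and "no" treated as vectors, so a whole data frame does not fit there. A sketch of one way to keep the replacement-matrix idea described above while avoiding ifelse() for the final step: assign through the logical matrix returned by is.na(). The sample data is invented for illustration.

```r
# Made-up sample data: one character column, two numeric columns
df <- data.frame(col1 = c("A", "A", "B"),
                 col2 = c(15.5, NA, 8),
                 col3 = c(8.5, NA, 6),
                 stringsAsFactors = FALSE)

na.replacements <- ifelse(col(df) == 2, 0, 1)  # 0 in column 2, 1 elsewhere
missing <- is.na(df)                           # logical matrix, same shape as df

clean.df <- df
# Only the NA cells are assigned, so the character column keeps its type
clean.df[missing] <- na.replacements[missing]
clean.df
```

Only the cells flagged in `missing` are overwritten, so the data frame is never flattened into a single vector.
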

I'd be grateful for any suggestions!

Thanks,
David Romano

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.