[R] is.na() can coerce character vectors to be factors within a dataframe

David Kane <a296180 at mica.fmr.com>
Thu May 16 23:49:57 CEST 2002


Thanks to Brian Ripley for suggesting, in reply to my previous post about a
problem with merge, that I trace through merge.data.frame. I did so with my test
case, and all seemed well until I got to:

        if (all.x) 
            for (i in seq(along = y)) is.na(y[[i]]) <- (lxy + 
                1):(lxy + nxx)

I believe that this code takes y (which has been expanded to have the same number
of rows as x) and sets to NA the rows that have no match in the original y. I
think that this works fine, except for variables in y that are character: those
are converted to factor.
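A small merge along these lines should show whether the conversion happens in a
given R installation. This is only a sketch: the data frames x and y and the
column names key and label are made up for illustration, and the class reported
at the end may differ between R versions.

```r
# Hypothetical data: x has a key (3) that is missing from y.
x <- data.frame(key = 1:3)
y <- data.frame(key = 1:2, label = c("a", "b"))
y$label <- as.character(y$label)   # make sure label is character going in

# all.x = TRUE keeps the unmatched row of x and fills y's columns with NA.
m <- merge(x, y, by = "key", all.x = TRUE)

print(m)
print(class(y$label))   # class before the merge
print(class(m$label))   # class after the merge; "factor" here would reproduce the report
```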

Consider a simple example:

> test <- data.frame(var = LETTERS[1:3])
> test$var <- as.character(test$var)
> test
  var
1   A
2   B
3   C
> is.na(test[[1]]) <- 2
> test
   var
1    A
2   <NA>
3    C
> is.factor(test$var)
[1] TRUE

Note that no problems arise if var is factor or numeric. Nor does this seem to
be a problem with character vectors that are not part of a data frame:

> z <- LETTERS[1:3]
> z
[1] "A" "B" "C"
> is.na(z) <- 2
> z
[1] "A" NA  "C"
> is.factor(z)
[1] FALSE
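The claim that factor and numeric columns are unaffected can be checked the same
way. The sketch below uses a made-up data frame df with columns num and fac, and
compares the column classes before and after the is.na() assignment.

```r
# Hypothetical data frame with one numeric and one factor column.
df <- data.frame(num = c(1.5, 2.5, 3.5), fac = factor(c("A", "B", "C")))

classes_before <- sapply(df, class)

# Mark the second element of each column as missing.
is.na(df[[1]]) <- 2
is.na(df[[2]]) <- 2

classes_after <- sapply(df, class)

print(classes_before)
print(classes_after)
print(identical(classes_before, classes_after))
```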

The truly strange thing (at least for me) is that, as far as R is concerned,
test[[1]] and z are identical.

> z <- LETTERS[1:3]
> test <- data.frame(var = LETTERS[1:3])
> test$var <- as.character(test$var)
> identical(z, test[[1]])
[1] TRUE

So why is.na() would coerce to factor for one but not for the other is a bit
of a mystery to me. Presumably it's a two-stage process: first the NA is
inserted in the character vector var, and then, when var is placed back into
test, some sort of forced conversion takes place. But I really shouldn't
speculate (further!) about something I know so little about.
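One way to test that guess is to perform the two stages by hand and look at the
class after each one. This is a sketch under the assumption that
is.na(test[[1]]) <- 2 amounts to "extract, modify, assign back". The names col,
class_after_stage1 and class_after_stage2 are only illustrative, and the result
of the second stage may depend on the R version.

```r
test <- data.frame(var = LETTERS[1:3])
test$var <- as.character(test$var)

# Stage 1: pull the column out and set element 2 to NA.
col <- test[[1]]
is.na(col) <- 2
class_after_stage1 <- class(col)        # col is an ordinary vector here

# Stage 2: put the modified column back into the data frame.
test[[1]] <- col
class_after_stage2 <- class(test[[1]])  # any conversion would show up here

print(class_after_stage1)
print(class_after_stage2)
```

If the class changes only in the second stage, the conversion comes from the
data frame assignment rather than from is.na() itself.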

Question:

1) Is the conversion of character vectors to factor vectors within dataframes a
   feature or a bug of is.na()? Or am I misunderstanding the whole situation?

 
Thanks,

David Kane



-- 
David Kane
Geode Capital Management
617-563-0122
david.d.kane at fmr.com