(PR#1577) [Rd] is.na<- coerces character vectors to be factors

ripley@stats.ox.ac.uk ripley@stats.ox.ac.uk
Mon, 20 May 2002 13:46:35 +0200 (MET DST)


The inconsistency is that you use $<- to set the column, then [[<- to
change it.  Had you tried to set the column by

x[[1]] <- as.character(x[[1]])

you would have seen the problem immediately (it does not work as you would
have intended).  If you want to be sure to turn off conversion to factor,
you need to set the column to class "AsIs".  My belief is that this
will behave consistently.

> x <- data.frame(var = I(LETTERS[1:3]))
> is.na(x[[1]]) <- 2
> x
   var
1    A
2 <NA>
3    C
>  is.character(x$var)
[1] TRUE
> is.factor(x$var)
[1] FALSE

In any case, it is nothing to do with is.na<-:

> x <- data.frame(var = LETTERS[1:3])
> x$var <- as.character(x$var)
>
> x[[1]][2] <- "3"
> x
  var
1   A
2   3
3   C
> is.character(x$var)
[1] FALSE

You'll see that [[<-.data.frame contains

            if (!inherits(value, "data.frame"))
                value <- as.data.frame(value)

and that is where the coercion is done.  It really isn't possible to tell
your intentions if you replace a whole column of a data frame.
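
The coercion step can be looked at in isolation.  What follows is only a
minimal sketch, not the code path inside [[<-.data.frame itself: it assumes
an R version in which as.data.frame() accepts a stringsAsFactors argument,
and the names vals, plain and kept are purely illustrative.

```r
# Sketch only: 'vals', 'plain' and 'kept' are illustrative names.
vals <- c("A", "B", "C")

# Character vector converted with factor conversion switched on
# (assumes as.data.frame() takes a stringsAsFactors argument):
plain <- as.data.frame(vals, stringsAsFactors = TRUE)
is.factor(plain[[1]])      # the column has become a factor

# Same call, but the vector is protected with I() (class "AsIs"):
kept <- as.data.frame(I(vals), stringsAsFactors = TRUE)
is.character(kept[[1]])    # the column is still character
```

The second call is why setting the column to class "AsIs" sidesteps the
conversion: the "AsIs" method of as.data.frame leaves the values alone.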

Having said all that, the source says

    if(nargs() < 4) {
	## really ambiguous, but follow common use as if list
	## el(x,i) <- value is the preferred approach

and S4 does not do the conversion.

I don't understand the comment, as

> get("el<-")
.Primitive("[[<-")

and so el<- just calls [[<- ....



On Mon, 20 May 2002 a296180@mica.fmr.com wrote:

> I am not sure if this is a bug within is.na<- or if it lies deeper in the
> dataframe construction process. Indeed, perhaps it is not a bug at all (in
> which case I would suggest that the help page for NA be provided with a warning
> for unsuspecting users (like me)).

Not appropriate!

> When used on a character vector within a dataframe, is.na<- coerces the vector
> to factor.
>
> > x <- data.frame(var = LETTERS[1:3])
> > x$var <- as.character(x$var)
> > x
>   var
> 1   A
> 2   B
> 3   C
> > is.character(x$var)
> [1] TRUE
> > is.na(x[[1]]) <- 2
> > x
>    var
> 1    A
> 2 <NA>
> 3    C
> > is.character(x$var)
> [1] FALSE
> > is.factor(x$var)
> [1] TRUE
> >
>
> Interestingly enough, this coercion does not occur if you refer to x$var
> instead of x[[1]].
>
> > x <- data.frame(var = LETTERS[1:3])
> > x$var <- as.character(x$var)
> > is.na(x$var) <- 2
> > x
>    var
> 1    A
> 2 <NA>
> 3    C
> > is.character(x$var)
> [1] TRUE
> > is.factor(x$var)
> [1] FALSE
> >
>
> I could (sort of) imagine a story in which the coercion is the desired
> behavior --  by using is.na you are implicitly taking apart a dataframe and
> putting it back together and, when you make dataframes, character vectors are
> coerced to factor by default. But I can't come up with a story as to why x$var
> should be handled differently than x[[1]].
>
> > R.version
>          _
> platform sparc-sun-solaris2.6
> arch     sparc
> os       solaris2.6
> system   sparc, solaris2.6
> status
> major    1
> minor    5.0
> year     2002
> month    04
> day      29
> language R
> >
>
> Thanks,
>
> David Kane
>
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
>

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

