problems with missing values created by conversion using as.matrix (PR#2130)
Wed, 9 Oct 2002 18:54:03 +0200 (MET DST)

> version
platform sparc-sun-solaris2.8
arch     sparc               
os       solaris2.8          
system   sparc, solaris2.8   
major    1                   
minor    6.0                 
year     2002                
month    10                  
day      01                  
language R                   


Create a very simple data frame containing a factor and a numeric vector,
each containing a missing value:

	> x <- data.frame( a=c("",NA), b=c(1,NA) ) 

Conversion to a matrix treats the two missing values differently:

	> as.matrix(x)
	  a  b   
	1 "" " 1"
	2 NA "NA"

The missing value in the factor variable has been correctly converted to a
missing value, while the missing value in the numeric vector has been
incorrectly converted to a string "NA", which is not recognized as a missing
value.
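The difference can be checked with is.na(). The sketch below rebuilds the character matrix by hand from the as.matrix(x) output shown in this report, rather than calling as.matrix() again, since the result of that call may differ between R versions. The names m, is.miss and literal.na are only for illustration.

```r
# Character matrix copied from the as.matrix(x) output in this report
# (filled column by column: a = "", NA and b = " 1", "NA").
m <- matrix(c("", NA, " 1", "NA"), nrow = 2,
            dimnames = list(c("1", "2"), c("a", "b")))

# TRUE marks real missing values; the string "NA" in column b is not one.
is.miss <- is.na(m)

# Cells that hold the literal two-character string "NA".
literal.na <- !is.na(m) & m == "NA"
```

Here is.miss is TRUE only for row 2 of column a, and literal.na is TRUE only for row 2 of column b.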

This turned up because I was using apply to check for rows containing only
blank or missing values:

	> all.blank <- function(x) all( is.na(x) | (x <= " ") )
	> blanks <- apply(x, 1, all.blank)
	> blanks
	    1     2 
	FALSE FALSE 

This should have yielded

	> blanks
	    1     2 
	FALSE  TRUE 
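One way to run the same check without going through a matrix at all is to test each column separately and combine the results row by row. This is only a sketch: blank.or.na is a hypothetical helper, and it uses trimws() to detect blank strings instead of the comparison against " " used above.

```r
x <- data.frame(a = c("", NA), b = c(1, NA))

# Hypothetical helper: TRUE where a value is missing or only whitespace.
blank.or.na <- function(col) {
    s <- as.character(col)      # keeps NA as a real missing value
    is.na(s) | trimws(s) == ""
}

# One logical vector per column, combined element-wise with &,
# so an entry is TRUE only if every column is blank or missing in that row.
blanks <- Reduce(`&`, lapply(x, blank.or.na))
```

For the data frame above, blanks is FALSE for the first row and TRUE for the second. The sketch assumes the data frame has at least one column.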

BTW direct conversion using as.character doesn't show any problems when
applied to the individual columns:

	> as.character(x$a)
	[1] "" NA
	> as.character(x$b)
	[1] "1" NA 

I think the problem is that as.matrix.data.frame() is using format() to
convert things to characters, which is resulting in a "NA" string and not a
missing value.  

Why isn't it using as.character() for this?
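As a workaround that does not depend on how as.matrix() converts the columns, the character matrix can be built column by column with as.character(). This is a minimal sketch, not the code in dataframe.R; char.matrix is a hypothetical helper and it assumes the data frame has at least one column.

```r
x <- data.frame(a = c("", NA), b = c(1, NA))

# Hypothetical helper, not part of base R: convert every column with
# as.character() and bind the columns into one character matrix.
char.matrix <- function(df) {
    cols <- lapply(df, as.character)   # NA stays NA in every column
    m <- do.call(cbind, cols)          # column names come from the list names
    rownames(m) <- row.names(df)
    m
}

m <- char.matrix(x)
is.na(m)   # both cells in row 2 are reported as missing
```

Unlike format(), as.character() does not pad the values to a common width, so the first entry of column b is "1" rather than " 1".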

For completeness here's the patch to make this change, but I have not
explored what other side effects this might have.

*** R-1.6.0/src/library/base/R/dataframe.R      Thu Aug 29 03:41:42 2002
--- R-1.6.0-GRW//src/library/base/R/dataframe.R Wed Oct  9 12:29:11 2002
*** 931,937 ****
            if (is.character(X[[j]]))
            xj <- X[[j]]
!           X[[j]] <- if(length(levels(xj))) as.vector(xj) else format(xj)
      X <- unlist(X, recursive = FALSE, use.names = FALSE)
--- 931,937 ----
            if (is.character(X[[j]]))
            xj <- X[[j]]
!           X[[j]] <- if(length(levels(xj))) as.vector(xj) else as.character(xj)
      X <- unlist(X, recursive = FALSE, use.names = FALSE)


