[R] as.Date() results depend on order of data within vector?

Patrick Connolly p_connolly at ihug.co.nz
Sun Jan 7 20:42:32 CET 2007


On Sun, 07-Jan-2007 at 12:01PM +0000, Mark Wardle wrote:

|> Dear all,
|> 
|> The as.Date() function appears to give different results depending on
|> the order of the vector passed into it.
|> 
|> d1 = c("1900-01-01", "2007-01-01","","2001-05-03")
|> d2 = c("", "1900-01-01", "2007-01-01","2001-05-03")
|> as.Date(d1)	# gives correct results
|> as.Date(d2)	# fails with error (* see below)
|> 
|> This problem does not arise if the dates are NA rather than an empty
|> string, but my data is coming via RODBC and I still don't have NAs
|> passed across properly.
|> 
|> I might add that I initially noticed this behaviour when using RODBC's
|> sqlQuery() function call, and I initially had difficulty explaining why
|> one column of dates was passed correctly, but another failed. The
|> failing column was a "date of death" column where it was NA ("") for
|> most patients.
|> 
|> I've come up with two workarounds that work. The first is to sort the
|> data at the SQL level, ensuring the initial record is not null. The
|> second is to use sqlQuery() with as.is=T option, and then do the sorting
|> and conversion afterwards.

Simpler, I think, is to add one line before the conversion:
d2[d2 == ""] <- NA

I've not tested the idea extensively, so there might be occasions
where it falls down.  If you're working with a data frame, you can use
one of the apply functions to affect all columns.
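
A rough sketch of what I mean, using the d2 vector from your message.
The data frame and its column names (dob, dod) are made up purely for
illustration, and I'm assuming the date columns arrive as character
strings rather than factors.

```r
# Vector case: d2 is copied from the original question
d2 <- c("", "1900-01-01", "2007-01-01", "2001-05-03")
d2[d2 == ""] <- NA       # empty strings become missing values
dates <- as.Date(d2)     # NA stays NA, the rest parse as ISO dates

# Data frame case: 'patients', 'dob' and 'dod' are hypothetical names
patients <- data.frame(
  dob = c("1900-01-01", "1950-06-15"),
  dod = c("", "2001-05-03"),
  stringsAsFactors = FALSE
)

# Replace "" with NA in every character column, leaving others alone
patients[] <- lapply(patients, function(col) {
  if (is.character(col)) col[col == ""] <- NA
  col
})

# Convert the cleaned column afterwards
patients$dod <- as.Date(patients$dod)
```

Assigning into patients[] keeps the result a data frame instead of
the plain list that lapply() returns.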


HTH

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___    Patrick Connolly   
 {~._.~}          		 Great minds discuss ideas    
 _( Y )_  	  	        Middle minds discuss events 
(:_~*~_:) 	       		 Small minds discuss people  
 (_)-(_)  	                           ..... Anon
	  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.


