[R] Checking for invalid dates: Code works but needs improvement

Rui Barradas ruipbarradas at sapo.pt
Mon Jan 30 20:32:04 CET 2012


Hello,
I'm glad it helped.


> 
> Error in if (any(is.na(x) & M != "un" & Y != "un")) cat("Warning: Invalid
> date values in",  :
> missing value where TRUE/FALSE needed
> 
> Why is this happening? If the code correctly handles the date
> "06/20/1840" without producing an error,
> why can't it do likewise with "05/16/2015"?
> 

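Because "un" is greater than "2012" when the two are compared as strings.
The problem starts with the 'ifelse': it changes the Y values to NA
if the condition is met, so Y != "un" is NA for those values as well,
and 'any' can then return NA, which 'if' cannot handle.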
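
A minimal sketch of how the NA reaches the 'if'. The vectors and the ifelse() call here are made up for illustration; they are an assumption about the shape of the original code, not a copy of it.

```r
# Hypothetical month/year strings split out of "mm/dd/yyyy"; "un" = unknown.
M <- c("03", "05", "un")
Y <- c("1985", "2015", "un")

# Character comparison is lexicographic, so "un" sorts after "2012".
too_late <- Y > "2012"              # "2015" and "un" both compare greater

# An ifelse() along these lines (assumed) turns both of them into NA.
Y_na <- ifelse(too_late, NA, Y)

# Pretend the second date was rejected, so is.na(x) is TRUE for it.
x_is_na <- c(FALSE, TRUE, FALSE)

# NA != "un" is NA, so the condition holds an NA and no TRUE.
cond <- x_is_na & M != "un" & Y_na != "un"
result <- any(cond)                 # NA rather than TRUE or FALSE

# 'if' refuses an NA condition; try() captures the error for inspection.
failed <- inherits(try(if (result) "bad", silent = TRUE), "try-error")
```

With these inputs 'failed' ends up TRUE, because 'result' is NA.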
Correction: instead of ifelse use

Known <- M != "un" & Y != "un"
Y[nchar(Y) > 4 | Y > "2012" | Y < "1900"] <- NA

Now you need to use the conjunction of Known and is.na(x) in both the 'if'
and 'cat' statements.
(Only dates with a known year and month that were keyed in wrongly will be
printed.)


if(any(Known & is.na(x)))
    cat("Warning: Invalid date values in", jj, "\n",
        as.character(DF[Known & is.na(x), jj]), "\n")
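
Put together, the correction might look like the sketch below. The data frame, the column name "birth", and the way x is derived from the strings are all hypothetical; only Known, the Y assignment, and the if/cat lines come from the code above.

```r
# Hypothetical data: one column of "mm/dd/yyyy" strings, "un" = unknown part.
DF <- data.frame(birth = c("06/20/1985", "05/16/2015", "un/un/un", "13/40/2001"),
                 stringsAsFactors = FALSE)
jj <- "birth"

parts <- strsplit(DF[[jj]], "/", fixed = TRUE)
M <- sapply(parts, `[`, 1)
Y <- sapply(parts, `[`, 3)

# Record what is known BEFORE anything is overwritten with NA.
Known <- M != "un" & Y != "un"
Y[nchar(Y) > 4 | Y > "2012" | Y < "1900"] <- NA

# Assumed conversion step: unparseable strings become NA in as.Date(),
# and out-of-range years are set to NA by hand.
x <- as.Date(DF[[jj]], format = "%m/%d/%Y")
x[is.na(Y)] <- NA

# Known has no NA, so neither does the conjunction, and 'if' is safe.
bad <- Known & is.na(x)
if (any(bad))
    cat("Warning: Invalid date values in", jj, "\n",
        as.character(DF[bad, jj]), "\n")
```

Here the "un/un/un" row is NA in x but is not reported, since Known is FALSE for it.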

As David said, 'any' expects comma-separated logical vectors, but in this
case the conjunction is closer to what is wanted.
It's an adversative conjunction that reads "Known BUT is.na(x)".
The point is that, according to your code, the unknowns shouldn't be printed.
They are just NA, not invalid.

Does this also answer the second question?

As for the third question, there are so few columns that I don't believe the
for loop can hurt.
There could be a problem in changing the class to Date using *apply, because
R passes arguments to functions by value, so only the copy inside the
function would be changed.
Considering the number of iterations in the loop, it's maybe simpler to
keep the loop.
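
A small sketch of the by-value point, using a made-up two-column data frame and a hypothetical helper named convert_in_place:

```r
# Hypothetical data frame with two character date columns.
DF <- data.frame(d1 = c("01/15/2001", "02/20/2002"),
                 d2 = c("03/10/2003", "04/25/2004"),
                 stringsAsFactors = FALSE)

# Assigning inside a function changes only the function's local copy.
convert_in_place <- function(df, col) {
    df[[col]] <- as.Date(df[[col]], format = "%m/%d/%Y")
    invisible(NULL)
}
convert_in_place(DF, "d1")
class_after_call <- class(DF$d1)        # DF itself was not touched

# The for loop assigns to DF directly, so the new class sticks.
for (jj in names(DF))
    DF[[jj]] <- as.Date(DF[[jj]], format = "%m/%d/%Y")
class_after_loop <- sapply(DF, class)
```

An *apply version can still work if its result is assigned back, for example DF[] <- lapply(DF, as.Date, format = "%m/%d/%Y"), but with this few columns the loop is probably just as clear.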

Rui Barradas

