[Rd] Date class shows Inf as NA; this confuses the use of is.na()

Gabe Becker becker.gabe using gene.com
Mon Jun 11 23:59:29 CEST 2018


Emil et al.,


On Mon, Jun 11, 2018 at 1:08 AM, Emil Bode <emil.bode using dans.knaw.nl> wrote:

> I don't think there's much wrong with is.na(as_date(Inf,
> origin='1970-01-01')) being FALSE, as there is still some "non-NA-ness"
> about the value (as difftime shows); it is the printed output that is
> confusing. The way cat treats it is clearer: it does print Inf.
>
> So would this be a solution?
>
> format.Date <- function (x, ...)
> {
>   xx <- format(as.POSIXlt(x), ...)
>   names(xx) <- names(x)
>   # elements that fail to format although the underlying value is not NA
>   bad <- is.na(xx) & !is.na(x)
>   xx[bad] <- paste('Invalid date:', as.numeric(x[bad]))
>   xx
> }
>
> Which causes this behaviour, which I think is clearer:
>
> environment(print.Date) <- .GlobalEnv
> x <- as_date(Inf, origin='1970-01-01')
> print(x)
> # [1] "Invalid date: Inf"
>

In my opinion, it's either invalid or it isn't. If it's actually invalid,
as_date (and as.Date, the equivalent core function, which is what is
actually relevant on this list) should fail, because it's an invalid date.
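
To make "should fail" concrete, here is a rough sketch of a stricter
constructor. The name as_date_strict is made up for illustration and is not
a function in base R or lubridate, and whether an error (rather than NA) is
the right response is exactly what is under discussion.

```r
# Hypothetical helper: refuses infinite input instead of returning a Date
# whose printed form is NA. Not part of base R or lubridate.
as_date_strict <- function(x, origin = "1970-01-01") {
  if (is.numeric(x) && any(is.infinite(x))) {
    stop("infinite values cannot be converted to a Date")
  }
  # NA and finite values are passed through to the ordinary conversion
  as.Date(x, origin = origin)
}

# Finite input behaves like as.Date(); infinite input raises an error
ok  <- as_date_strict(17693)
res <- try(as_date_strict(Inf), silent = TRUE)
```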

If it *isn't* invalid, having the print method tell users it is seems
problematic.
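
For reference, the "non-NA-ness" can be checked directly on the stored
number. A small sketch using only base R; the variable names are arbitrary.

```r
x <- as.Date(Inf, origin = "1970-01-01")

# A Date is stored as a number of days since the origin
days <- unclass(x)

not_missing <- !is.na(x)           # default is.na() looks at the stored number
is_infinite <- is.infinite(days)   # ... which is infinite rather than missing

# Arithmetic still works on the stored value, as noted earlier in the thread
gap <- as.numeric(Sys.Date() - x)
```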

And people seem to be leaning towards it not being invalid. That is a bit
surprising to me, as my first thought was that infinite dates don't make
any sense, but I don't really have a horse in this race, so I defer to the
cooler heads who are saying that an infinite date perhaps should not be
explicitly disallowed. If it's not disallowed, though, then it's not
invalid, and we shouldn't confuse users by saying it is, imho.

Best,
~G


>
> Best regards,
> Emil Bode
>
> On 09/06/2018, 13:52, "R-devel on behalf of Joris Meys" <
> r-devel-bounces using r-project.org on behalf of jorismeys using gmail.com> wrote:
>
>     And now I've seen I copied the wrong part of ?is.na
>
>     > The default method for is.na applied to an atomic vector returns a
>     logical vector of the same length as its argument x, containing TRUE
>     for those elements marked NA or, for numeric or complex vectors, NaN,
>     and FALSE otherwise.
>
>     Key point being "atomic vector" here.
>
>
>     On Sat, Jun 9, 2018 at 1:41 PM, Joris Meys <jorismeys using gmail.com> wrote:
>
>     > Hi Werner,
>     >
>     > on ?is.na it says:
>     >
>     > > The default method for anyNA handles atomic vectors without a
>     > class and NULL.
>     >
>     > I hear you, and it is confusing to say the least. Looking deeper, the
>     > culprit seems to be in the conversion of a Date to POSIXlt prior to
>     > the formatting:
>     >
>     > > x <- as.Date(Inf,origin = '1970-01-01')
>     > > is.na(as.POSIXlt(x))
>     > [1] TRUE
>     >
>     > Given this implicit conversion, I'd argue that as.Date should really
>     > return NA as well when passed an infinite value. The other option is
>     > to provide an is.na method for the Date class, which is -given is.na
>     > is an internal generic- rather trivial:
>     >
>     > > is.na.Date <- function(x) is.na(as.POSIXlt(x))
>     > > is.na(x)
>     > [1] TRUE
>     >
>     > This might be a workaround for your current problem without needing
>     > changes to R itself. But this will give a "wrong" answer in the
>     > sense that this still works:
>     >
>     > > Sys.Date() - x
>     > Time difference of -Inf days
>     >
>     > I personally would go for NA as the "correct" date for an infinite
>     > value, but given that this will have implications in other areas,
>     > there is a possibility of breaking code and it should be
>     > investigated a bit further imho.
>     > Cheers
>     > Joris
>     >
>     >
>     >
>     >
>     > On Fri, Jun 8, 2018 at 11:21 PM, Werner Grundlingh
>     > <wgrundlingh using gmail.com> wrote:
>     >
>     >> Indeed. as_date is from lubridate, but the same holds for as.Date.
>     >>
>     >> The output and its interpretation should be consistent, otherwise
>     >> it leads to confusion when programming. I only came to understand
>     >> that the difference exists after asking a question on Stack Overflow:
>     >>   https://stackoverflow.com/q/50766089/914686
>     >> This understanding is never mentioned in the documentation - that
>     >> an Inf date is actually represented as NA:
>     >>   https://www.rdocumentation.org/packages/base/versions/3.5.0/topics/as.Date
>     >> So I'm of the impression that the display should be fixed as a first
>     >> option (thereby providing clarity/transparency in terms of back-end
>     >> and output), or the documentation amended (to highlight this) as a
>     >> second option.
>     >>
>     >> ______________________________________________
>     >> R-devel using r-project.org mailing list
>     >> https://stat.ethz.ch/mailman/listinfo/r-devel
>     >>
>     >
>     >
>     >
>     > --
>     > Joris Meys
>     > Statistical consultant
>     >
>     > Department of Data Analysis and Mathematical Modelling
>     > Ghent University
>     > Coupure Links 653, B-9000 Gent (Belgium)
>     >
>
>
>
>     --
>     Joris Meys
>



-- 
Gabriel Becker, Ph.D
Scientist
Bioinformatics and Computational Biology
Genentech Research
