[R] Unintended behaviour of stats::time not returning integers for the first cycle

Martin Maechler m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Wed Oct 19 11:44:56 CEST 2022


>>>>> Martin Maechler 
>>>>>     on Wed, 19 Oct 2022 10:05:31 +0200 writes:

>>>>> Andreï V Kostyrka 
>>>>>     on Tue, 18 Oct 2022 16:26:28 +0400 writes:

    >> Sure, this works, and I was thinking about this solution, but it seems like
    >> a dirty one-time trick. I was wondering whether the following 3 lines could
    >> be considered for inclusion by the core developers, but did not know which
    >> mailing list to write to. 

    > As Jeff alluded to, *every* message to this list has a footer
    > with a link to *the* POSTING GUIDE ...

    > and from there you quickly learn it is 'R-devel' (instead of 'R-help').

    > Now that we have already half a dozen messages here, let's keep
    > the whole thread here, even if only for ease of reading the list archives(!)

    >> Here is my proposal:

    >> correctTime <- function (x, offset = 0, ...) { # Changes stats:::time.default
    >> n <- if (is.matrix(x)) nrow(x) else length(x)
    >> xtsp <- attr(hasTsp(x), "tsp")
    >> y <- seq.int(xtsp[1L], xtsp[2L], length.out = n) + offset/xtsp[3L]
    >> round.y <- round(y)
    >> near.integer <- abs(round.y - y) < sqrt(.Machine$double.eps)
    >> y[near.integer] <- round.y[near.integer]
    >> tsp(y) <- xtsp
    >> y
    >> }

    > Yes, some such change does make sense to me, too.
    > As the computations above are relatively costly (compared to the
    > current  time.default()  implementation),
    > and also for strict back compatibility reasons, I think the
    > correction should only happen when the user asks for it,  say by
    > using a new argument 'roundYear = TRUE'  (where the default
    > remains roundYear=FALSE).

    > Martin Maechler
    > ETH Zurich  and  R Core Team

After some more thinking and pondering:

No, there's no need for a 'roundYear = *' argument, but rather
we'd use the 'ts.eps' argument as in many similar situations
with ts() objects needing rounding adjustments.
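
For readers who have not met 'ts.eps' before, here is a minimal sketch of
where it already appears. It only inspects the global option and the
formal arguments of ts(); nothing here is part of the proposal itself.

```r
# 'ts.eps' is a documented global option (see ?options); 1e-05 unless changed.
eps <- getOption("ts.eps")
print(eps)

# ts() already has an argument of the same name that defaults to the option.
ts_args <- names(formals(stats::ts))
print("ts.eps" %in% ts_args)
```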

Consequently, my current (only lightly tested) proposal is

time.default <- function (x, offset = 0, ts.eps = getOption("ts.eps"), ...)
{
    xtsp <- attr(hasTsp(x), "tsp")
    y <- seq.int(xtsp[1L], xtsp[2L], length.out = NROW(x)) + offset/xtsp[3L]
    if(ts.eps > 0) {
        iy <- round(y)
        nearI <- abs(iy - y) < ts.eps
        y[nearI] <- iy[nearI]
    }
    tsp(y) <- xtsp
    y
}

It *does* fix your example(s) below.
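
A quick way to check this, as a sketch only: the function body below is
copied from the proposal above, but it is given the hypothetical name
'time_rounded' so that it does not mask stats::time.default, and it
assumes the 'ts.eps' option still has a positive value such as its
default.

```r
# Hypothetical name; the body is the time.default proposal from this message.
time_rounded <- function(x, offset = 0, ts.eps = getOption("ts.eps"), ...) {
    xtsp <- attr(hasTsp(x), "tsp")
    y <- seq.int(xtsp[1L], xtsp[2L], length.out = NROW(x)) + offset/xtsp[3L]
    if (ts.eps > 0) {
        iy <- round(y)
        nearI <- abs(iy - y) < ts.eps   # times within ts.eps of an integer
        y[nearI] <- iy[nearI]           # snap those to the exact integer
    }
    tsp(y) <- xtsp
    y
}

# The example from the original report: monthly data starting in Feb 2002.
x <- ts(2:252, start = c(2002, 2), frequency = 12)
true.year <- rep(2002:2022, each = 12)[-1]

yr <- floor(as.numeric(time_rounded(x)))
print(time_rounded(x)[240], 20)   # 2022
print(all(yr == true.year))
```

Only values within ts.eps of an integer are changed, so the other months
keep whatever seq.int() produced for them.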

Martin


    >> x <- ts(2:252, start = c(2002, 2), freq = 12)
    >> d <- seq.Date(as.Date("2002-02-01"), to = as.Date("2022-12-01"), by =
    >> "month")
    >> true.year <- rep(2002:2022, each = 12)[-1]
    >> wrong.year <- floor(as.numeric(time(x)))
    >> print(as.numeric(time(x))[240], 20) # 2021.9999999999997726,
    >> # the floor of which is 2021
    >> print(correctTime(x)[240], 20) # 2022

    >> On Sat, Oct 15, 2022 at 11:56 AM Eric Berger <ericjberger using gmail.com> wrote:

    >>> Alternatively
    >>> 
    >>> correct.year <- floor(time(x)+1e-6)
    >>> 
    >>> On Sat, Oct 15, 2022 at 10:26 AM Andreï V. Kostyrka <
    >>> andrei.kostyrka using gmail.com> wrote:
    >>> 
    >>>> Dear all,
    >>>> 
    >>>> 
    >>>> 
    >>>> I was using stats::time to obtain the year as a floor of it, and
    >>>> encountered a problem: due to a rounding error (most likely due to its
    >>>> reliance on the base::seq.int internally, but correct me if I am wrong),
    >>>> the actual number corresponding to the beginning of a year X can still be
    >>>> (X-1).9999999..., resulting in the following undesirable behaviour.
    >>>> 
    >>>> 
    >>>> 
    >>>> One of the simplest ways of getting the year from a ts object is
    >>>> floor(time(...)). However, if the starting time cannot be represented
    >>>> exactly in binary floating point, then the supposedly integer time does
    >>>> not have a .000000... fractional part:
    >>>> 
    >>>> 
    >>>> 
    >>>> x <- ts(2:252, start = c(2002, 2), freq = 12)
    >>>> 
    >>>> d <- seq.Date(as.Date("2002-02-01"), to = as.Date("2022-12-01"), by =
    >>>> "month")
    >>>> 
    >>>> true.year <- rep(2002:2022, each = 12)[-1]
    >>>> 
    >>>> wrong.year <- floor(as.numeric(time(x)))
    >>>> 
    >>>> tail(cbind(as.character(d), true.year, wrong.year), 15) # Look at 2022-01-01
    >>>> 
    >>>> print(as.numeric(time(x))[240], 20) # 2021.9999999999997726,
    >>>> # the floor of which is 2021
    >>>> 
    >>>> 
    >>>> 
    >>>> Yes, I have read the 'R inferno' book and know the famous '0.3 != 0.7 -
    >>>> 0.4' example, but I believe that the expected / intended behaviour would
    >>>> be to return round years for the first observation in a year. This
    >>>> could be achieved by rounding the near-integer times to integers.
    >>>> 
    >>>> 
    >>>> 
    >>>> Since users working with dates expect to get exact integer years for
    >>>> the first cycle of a ts, this should be changed. Thank you in advance for
    >>>> considering a fix.
    >>>> 
    >>>> 
    >>>> 
    >>>> Yours sincerely,
    >>>> 
    >>>> Andreï V. Kostyrka
    >>>> 
    >>>> [[alternative HTML version deleted]]
    >>>> 
    >>>> ______________________________________________
    >>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
    >>>> https://stat.ethz.ch/mailman/listinfo/r-help
    >>>> PLEASE do read the posting guide
    >>>> http://www.R-project.org/posting-guide.html
    >>>> and provide commented, minimal, self-contained, reproducible code.
    >>>> 
    >>> 



