[R] Unintended behaviour of stats::time not returning integers for the first cycle

Andreï V. Kostyrka andrei.kostyrka using gmail.com
Tue Oct 18 14:26:28 CEST 2022


Sure, this works, and I had considered this solution, but it seems like
a dirty one-off trick. I was wondering whether the three rounding lines in
the function below could be considered by the core developers for inclusion
in stats:::time.default, but I did not know which mailing list to write to.
Here is my proposal:

correctTime <- function (x, offset = 0, ...) { # Changes stats:::time.default
  n <- if (is.matrix(x)) nrow(x) else length(x)
  xtsp <- attr(hasTsp(x), "tsp")
  y <- seq.int(xtsp[1L], xtsp[2L], length.out = n) + offset/xtsp[3L]
  # Proposed addition: snap values within tolerance of an integer to it
  round.y <- round(y)
  near.integer <- abs(round.y - y) < sqrt(.Machine$double.eps)
  y[near.integer] <- round.y[near.integer]
  tsp(y) <- xtsp
  y
}

x <- ts(2:252, start = c(2002, 2), frequency = 12)
d <- seq.Date(as.Date("2002-02-01"), to = as.Date("2022-12-01"), by = "month")
true.year <- rep(2002:2022, each = 12)[-1]
wrong.year <- floor(as.numeric(time(x)))
# 2021.9999999999997726, the floor of which is 2021
print(as.numeric(time(x))[240], 20)
print(correctTime(x)[240], 20) # 2022
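
If only the year (or the position within the year) is needed, one way to
sidestep taking the floor of a floating-point time is to do the arithmetic on
whole period counts instead. The following is only a sketch, assuming the
frequency is a whole number; the names f, first, idx, year and period are made
up for this illustration and are not part of stats.

```r
# Sketch: derive the year and the within-year position from tsp() using
# whole-number arithmetic. Assumes an integer frequency (e.g. 12 or 4).
x <- ts(2:252, start = c(2002, 2), frequency = 12)

xtsp  <- tsp(x)                          # c(start, end, frequency)
f     <- round(xtsp[3L])                 # periods per year, here 12
first <- round(xtsp[1L] * f)             # first observation, counted in periods
idx   <- first + seq_len(NROW(x)) - 1    # period count of every observation

year   <- idx %/% f                      # whole-number year, no floor() of a fraction
period <- idx %% f + 1                   # position within the year, 1..f

year[240]    # 2022 (January 2022)
period[240]  # 1
```

Rounding happens once, on the start time multiplied by the frequency, so the
accumulated error of the generated sequence never enters the result. This does
not apply to non-integer frequencies, for which the tolerance-based rounding
above remains the more general option.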

On Sat, Oct 15, 2022 at 11:56 AM Eric Berger <ericjberger using gmail.com> wrote:

> Alternatively
>
> correct.year <- floor(time(x)+1e-6)
>
> On Sat, Oct 15, 2022 at 10:26 AM Andreï V. Kostyrka <
> andrei.kostyrka using gmail.com> wrote:
>
>> Dear all,
>>
>>
>>
>> I was using stats::time to obtain the year as a floor of it, and
>> encountered a problem: due to a rounding error (most likely due to its
>> reliance on the base::seq.int internally, but correct me if I am wrong),
>> the actual number corresponding to the beginning of a year X can still be
>> (X-1).9999999..., resulting in the following undesirable behaviour.
>>
>>
>>
>> One of the simplest ways of getting the year from a ts object is
>> floor(time(...)). However, if the starting time cannot be represented
>> exactly in binary floating point, then the supposedly integer time does
>> not have a zero fractional part:
>>
>>
>>
>> x <- ts(2:252, start = c(2002, 2), freq = 12)
>>
>> d <- seq.Date(as.Date("2002-02-01"), to = as.Date("2022-12-01"), by =
>> "month")
>>
>> true.year <- rep(2002:2022, each = 12)[-1]
>>
>> wrong.year <- floor(as.numeric(time(x)))
>>
>> tail(cbind(as.character(d), true.year, wrong.year), 15) # Look at 2022-01-01
>>
>> print(as.numeric(time(x))[240], 20) # 2021.9999999999997726, the floor of which is 2021
>>
>>
>>
>> Yes, I have read the 'R inferno' book and know the famous '0.3 != 0.7 -
>> 0.4' example, but I believe that the expected / intended behaviour would
>> actually be returning round years for the first observation in a year. This
>> could be achieved by rounding the near-integer times to integers.
>>
>>
>>
>> Since users working with dates are expecting to get exact integer years
>> for the first cycle of a ts, this should be changed. Thank you in advance
>> for considering a fix.
>>
>>
>>
>> Yours sincerely,
>>
>> Andreï V. Kostyrka
