[Rd] as.character.POSIXt in R devel

Suharto Anggono
Fri Oct 7 11:35:22 CEST 2022


 Yes, no documentation.
A "POSIXlt" object with out-of-bounds components, or whose components are not all of the same length, may be produced internally by 'seq.POSIXt'.
Initially, 'r1' is a "POSIXlt" object all of whose components have length 1.
Component 'year', 'mon', or 'mday' of 'r1' is then modified, and the new value may have more than one element. For 'mon' or 'mday', some elements may be out of bounds.
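
A minimal sketch of that mechanism, not the actual 'seq.POSIXt' source. The name 'r1' follows the description above; the start date and the "UTC" time zone are assumptions chosen only for illustration.

```r
# Start from a length-1 "POSIXlt" object, as described for 'r1'.
r1 <- as.POSIXlt("2022-10-07 11:35:22", tz = "UTC")

# Replace 'mon' by a longer vector: 9, 10, 11, 12, 13.
# The last two values lie outside the documented 0-11 range, and
# all other components still have length 1.
r1$mon <- r1$mon + 0:4

# Conversion to "POSIXct" recycles the short components and
# carries the excess months over into the following year.
res <- as.POSIXct(r1)
format(res, "%Y-%m-%d")
```

So the "improper" object exists only between the modification and the conversion, but any method applied to it in that window would see unequal lengths and out-of-bounds months.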


----------------------------
On Monday, 3 October 2022, 11:58:53 pm GMT+7, Martin Maechler <maechler using stat.math.ethz.ch> wrote:


>>>>> Martin Maechler
>>>>>    on Mon, 3 Oct 2022 14:46:08 +0200 writes:

>>>>> Suharto Anggono Suharto Anggono via R-devel
>>>>>    on Sun, 2 Oct 2022 08:42:50 +0000 (UTC) writes:

    >> With r82904, 'as.character.POSIXt' in R devel is changed. The NEWS item:

    >> as.character(<POSIXt>) now behaves more in line with the
    >> methods for atomic vectors such as numbers, and is no longer
    >> influenced by options().

[..............]

[snip]


    >> * Behavior with "improper" "POSIXlt" object:

    >> - "POSIXlt" object with out-of-bounds components is not normalized.

    >> Example (modified from regr.tests-1d.R):
    >> op <- options(scipen = 0) # (default setting)
    >> x <- structure(
    >> list(sec = 10000, min = 59L, hour = 18L,
    >> mday = 6L, mon = 11L, year = 116L,
    >> wday = 2L, yday = 340L,
    >> isdst = 0L, zone = "CET", gmtoff = 3600L),
    >> class = c("POSIXlt", "POSIXt"), tzone = "CET")
    >> as.character(x)
    >> # "2016-12-06 18:59:10000"
    >> format(x)
    >> # "2016-12-06 21:45:40"
    >> options(op)


    > Yes, we knew that  and were not too happy about it, but also not
    > too unhappy:
    > After all,            help(DateTimeClasses)
    > clearly explains how
    > POSIXlt objects should look like :

    > -------------------------------------------------------------------
    > Class ‘"POSIXlt"’ is a named list of vectors representing

    > ‘sec’ 0-61: seconds.
    > ‘min’ 0-59: minutes.
    > ‘hour’ 0-23: hours.
    > ‘mday’ 1-31: day of the month
    > ‘mon’ 0-11: months after the first of the year.
    > ‘year’ years since 1900.
    > ‘wday’ 0-6 day of the week, starting on Sunday.
    > ‘yday’ 0-365: day of the year (365 only in leap years).

    > ‘isdst’ Daylight Saving Time ... ... ...
    > ................................
    > ................................

    > -------------------------------------------------------------------

    > We have been aware that as.character() assumes the above specification,
    > even though other R functions, notably format() which uses
    > internal (C level; either system (OS) or R's own) strptime() do
    > arithmetic (modulo 60, then modulo 24, then modulo month length)
    > to compute the date "used".

    > Allowing such  "un-normalized" / out-of-bound  POSIXlt objects
    > in R has not been documented AFAICS, and has the consequence
    > that two different POSIXlt objects may correspond to the exact
    > same time.

    > This may be something worth discussing.
    > In some sense we are discussing how the "POSIXlt" class is defined
    > (even though an S3 class is never formally defined).

(nothing changed here)
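
To make the "two different POSIXlt objects, same time" point concrete, here is a small sketch. The helper 'mk_lt' is hypothetical, and "UTC" is used instead of "CET" only to keep the example independent of the time zone database.

```r
# Hypothetical helper: build a length-1 "POSIXlt" for 2016-12-06 in UTC
# directly from raw components, without any normalization.
mk_lt <- function(sec, min, hour) {
    structure(
        list(sec = sec, min = min, hour = hour,
             mday = 6L, mon = 11L, year = 116L,
             wday = 2L, yday = 340L,
             isdst = 0L, zone = "UTC", gmtoff = 0L),
        class = c("POSIXlt", "POSIXt"), tzone = "UTC")
}

a <- mk_lt(10000, 59L, 18L)  # 'sec' far outside 0-61
b <- mk_lt(40, 45L, 21L)     # the same instant, all components in bounds

# Both objects describe the same point in time.
as.POSIXct(a) == as.POSIXct(b)

# A round trip through "POSIXct" yields in-bounds components.
a_norm <- as.POSIXlt(as.POSIXct(a))
c(hour = a_norm$hour, min = a_norm$min, sec = a_norm$sec)
```

The round trip is one way a caller can normalize explicitly before using a method that assumes the documented ranges.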


    >> - With "POSIXlt" object where sec, min, hour, mday, mon,
    >> and year components are not all of the same length, recycling is not handled.

This is still the case... (see below).

    > Good point.  I tend to agree that this should be improved *and* also
    > documented: AFAIK, it is also not at all documented  (or is it ??)
    > that the POSIXlt components should be thought to be recycling.

    > If we decide we want that,
    > once this is documented (and all methods/functions tested with
    > such POSIXlt) it could also be used to use considerably smaller size
    > POSIXlt objects, e.g, when all parts are in the same year, or
    > when all seconds are 0, or ...

    >> Example (modified from regr.tests-1d.R):
    >> op <- options(scipen = 0) # (default setting)
    >> x <- structure(
    >> list(sec = c(1,  2), min = 59L, hour = 18L,
    >> mday = 6L, mon = 11L, year = 116L,
    >> wday = 2L, yday = 340L,
    >> isdst = 0L, zone = "CET", gmtoff = 3600L),
    >> class = c("POSIXlt", "POSIXt"), tzone = "CET")
    >> as.character(x)
    >> # c("2016-12-06 18:59:01", "NA NA:NA:02")
    >> format(x)
    >> # c("2016-12-06 18:59:01", "2016-12-06 18:59:02")
    >> options(op)

Note that currently such {needing recycling} - cases are
*also* not handled by the simple  (and important!)  length.POSIXlt()
method, either:  It currently only looks at the '.$sec'
component !

So this case does need discussion too.
I think it's unfortunate that *some* *.POSIXt methods do such
recycling, e.g. format.POSIXt,
but others do not {and the documentation does not even mention recycling}.
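
Until recycling is handled uniformly, a caller could recycle explicitly. The following is only a sketch: 'recycle_lt' is a hypothetical helper, not part of R, and it assumes every component has non-zero length.

```r
# Hypothetical helper: recycle all components of a "POSIXlt" object
# to the length of its longest component.
recycle_lt <- function(x) {
    stopifnot(inherits(x, "POSIXlt"))
    comps <- unclass(x)
    n <- max(lengths(comps, use.names = FALSE))
    out <- lapply(comps, rep_len, length.out = n)
    # lapply() keeps the component names; restore class and time zone.
    structure(out, class = oldClass(x), tzone = attr(x, "tzone"))
}

# Same shape as the example above, but in UTC (an assumption made
# here to avoid depending on the time zone database).
x <- structure(
    list(sec = c(1, 2), min = 59L, hour = 18L,
         mday = 6L, mon = 11L, year = 116L,
         wday = 2L, yday = 340L,
         isdst = 0L, zone = "UTC", gmtoff = 0L),
    class = c("POSIXlt", "POSIXt"), tzone = "UTC")

y <- recycle_lt(x)
lengths(unclass(y))  # every component now has length 2
format(y)
```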

As mentioned, I am *pro* going in that direction;
so I would change

  length.POSIXlt <- function(x) length(unclass(x)[[1L]])

      (which only uses x$sec !)

to

  length.POSIXlt <- function(x) max(lengths(unclass(x), use.names=FALSE))

not allowing 0-length recycling; 0-length components
should really be illegal in an otherwise non-0-length POSIXlt x
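
The difference between the two definitions can be seen without touching the real method. In this sketch they are given the hypothetical names 'length_first' and 'length_max', and the test object (with 'mon' longer than 'sec') is made up for illustration.

```r
# The two definitions from above, under hypothetical names so that
# the installed length.POSIXlt() method is left untouched.
length_first <- function(x) length(unclass(x)[[1L]])
length_max   <- function(x) max(lengths(unclass(x), use.names = FALSE))

# 2022-01-01 and 2022-02-01 in UTC, with only 'mon' of length 2.
x <- structure(
    list(sec = 0, min = 0L, hour = 0L,
         mday = 1L, mon = c(0L, 1L), year = 122L,
         wday = 6L, yday = 0L,
         isdst = 0L, zone = "UTC", gmtoff = 0L),
    class = c("POSIXlt", "POSIXt"), tzone = "UTC")

length_first(x)  # looks at 'sec' only
length_max(x)    # length of the longest component
```

Here the first definition reports 1 and the second reports 2, which is the length 'format' would effectively use.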



Martin  