[Rd] inconsistency between as.list(df) and as.list(mat) with mode(mat) == "list"

Martin Maechler m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Tue Feb 1 18:58:54 CET 2022


>>>>> Gabriel Becker 
>>>>>     on Mon, 31 Jan 2022 12:11:10 -0800 writes:

  (using an HTMLifying mail client .... so I've manually pretty edited a bit)

    > Hi All,

    > I ran into the following the other day:

    >> mat <- matrix(1:6, nrow = 2)
    >> as.list(mat)
    > [[1]]
    > [1] 1

    > *<snip>*

    > [[6]]
    > [1] 6

    >> mat2 <- mat
    >> mode(mat2) <- "list"
    >> as.list(mat2)
    >   [,1] [,2] [,3]
    > [1,] 1    3    5
    > [2,] 2    4    6


    > I realize this is not guaranteed by the documentation, and the behavior is
    > technically (if I would argue fairly subtly) as documented. Generally,
    > however, as.list returns something without dimensions (other than length),
    > regardless of the dimensions of the input.

    > Furthermore, this behavior agrees with neither the data.frame (which are
    > lists) method nor the non-list-mode matrix behavior which comes from the
    > default behavior. Both result in a non-dimensioned object (the data.frame
    > method explicitly and intentionally so).

    > Matrices of mode "list" are fairly rare, in practice, I would think, but I
    > wonder if the as.list behavior for them should agree with that of similar
    > dimensioned objects (data.frames and non-list-mode matrices). As a user, I
    > certainly expected it to, and had to read the docs with a careful eye
    > before I realized what was happening and why.

    > For the record, as.vector  does not drop dimension (or anything else) from
    > data.frames nor list-matrices, so there the behaviors agree, although we do
    > get:

    >> is.vector(mat)
    > [1] FALSE

    >> is.vector(mat2)
    > [1] FALSE

    >> is.vector(mtcars)
    > [1] FALSE


    > Which does make the fact that for the latter two as.vector returns the
    > objects unmodified somewhat puzzling.

    > I wonder if as.list and as.vector could get a strict argument - it could
    > default to FALSE for a deprecation period, or forever if preferred by
    > R-core -  where attributes are always stripped for 'strict' conversions.

    > Also, as a final aside, the documentation at ?as.list says:

    > Attributes may be
    > dropped unless the argument already is a list or expression.

    > (This is inconsistent with functions such as ‘as.character’ which
    > always drop attributes, *and is for efficiency since lists can be*
    > *     expensive to copy.*)

    > (emphasis mine). Is this still the case with shallow duplication? I was
    > under the impression that it was not.
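
The inconsistency described in the quoted message, and the "strict" conversion proposed there, can be sketched at R level. This is only an illustration: `as_list_strict()` is a hypothetical helper name, not an existing or planned base R function, and stripping every attribute (including names) is an assumption about what "strict" would mean.

```r
# Hypothetical helper (not part of base R): a "strict" as.list() that
# always returns a plain list with no attributes at all.
as_list_strict <- function(x) {
  res <- as.list(x)
  attributes(res) <- NULL   # drops dim, dimnames, names, class, ...
  res
}

mat  <- matrix(1:6, nrow = 2)
mat2 <- mat
mode(mat2) <- "list"        # list-mode matrix, dim attribute is kept

dim(as.list(mat))           # NULL: the default method drops dimensions
dim(as.list(mat2))          # 2 3:  a list-mode matrix is returned unchanged
dim(as.list(mtcars))        # NULL: the data.frame method drops them too

dim(as_list_strict(mat2))     # NULL
length(as_list_strict(mat2))  # 6
```

With such a helper all three inputs yield a dimensionless list, at the cost of also losing the column names of a data.frame.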

Well, you are entering the topic that Kurt Hornik and I tried to
improve on two months ago and then had to give up on (for the time
being) with only a small step of progress, at the time
producing extra work for CRAN team members, who saw many dozens
of CRAN packages failing just because we tried to change
is.vector() / as.vector() to become slightly less inconsistent.

There were many misuse problems in these CRAN packages,
which basically used  is.vector(obj) to check if `obj` was not
a matrix.
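
A small illustration of why is.vector(obj) is not a reliable "not a matrix" test (the objects below are made up for the example and do not come from any of the affected packages):

```r
m <- matrix(1:6, nrow = 2)
x <- 1:3
y <- structure(1:3, units = "cm")  # plain vector carrying an extra attribute
l <- list(1, "a")

is.vector(m)   # FALSE, which is what the misuse relies on
is.vector(x)   # TRUE
is.vector(y)   # FALSE, although y is not a matrix
is.vector(l)   # TRUE,  although l is not an atomic vector

# Asking the intended question directly:
!is.matrix(y)  # TRUE
```

is.vector() returns FALSE as soon as the object has any attribute other than names, so it answers a different question than "is this a matrix?".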

During ca. one week in early December 2021, we (mostly me) tried
several things and in the end had to conditionalize most of the
change (via an environment variable you must set *before*
starting R), because we saw too much R code out there that is
based on wrong assumptions ...
------------------------------------------------------------------------
r81299 | maechler | 2021-12-06 13:21:26 +0100 (Mon, 06 Dec 2021) | 1 line
Changed paths:
   M /trunk/doc/NEWS.Rd
   M /trunk/src/library/base/man/vector.Rd
   M /trunk/src/main/coerce.c
   M /trunk/tests/demos.Rout.save
   M /trunk/tests/reg-tests-1d.R

conditionalize most as.vector/is.vector changes from 81252,81270,81274,81285-6
------------------------------------------------------------------------

I mentioned above that one problem is that useRs use is.vector() when
they shouldn't, because they are not aware that list()s and
expression()s also fulfill `is.vector()`.
Instead, I would have recommended using (is.atomic() && !is.array()),
which I conceptually call is.simplevector() in my mind.
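
Written out as a function, the suggested test might look as follows. is.simplevector() is not an existing R function; the name is the conceptual one from the paragraph above, and the body is just the expression given there.

```r
# Hypothetical helper: TRUE for atomic objects without a dim attribute.
is.simplevector <- function(x) is.atomic(x) && !is.array(x)

is.simplevector(1:3)                # TRUE
is.simplevector(c(a = 1, b = 2))    # TRUE:  names do not matter
is.simplevector(matrix(1:6, 2))     # FALSE: has dimensions
is.simplevector(list(1, 2))         # FALSE: not atomic
is.simplevector(expression(a + b))  # FALSE: not atomic
```

Unlike is.vector(), this rejects lists and expressions and does not depend on which other attributes are present.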

But there's another fact which muddies the waters further:
is.atomic() actually does *not* check for atomic vectors,
but for "atomic vector _OR_ NULL", which I've found unfortunate.

Since then, I've contemplated introducing a new primitive
is.atomicV()  which really is true only if its argument is an
atomic vector.
One thing that is not so nice is its name. Making it even longer is
strongly against my taste ("testing for 'atom' should be short
and succinct"), so maybe people would agree with is.atom().
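
An R-level stand-in for such a test could look like the sketch below. It is only an approximation of the idea: the real thing would be a primitive, and the definition here is an assumption, written so that NULL is rejected regardless of how is.atomic() treats NULL in the running version of R.

```r
# Hypothetical stand-in for the contemplated primitive:
# TRUE only for genuine atomic vectors, never for NULL.
is.atomicV <- function(x) is.atomic(x) && !is.null(x)

is.atomicV(1:3)      # TRUE
is.atomicV("a")      # TRUE
is.atomicV(NULL)     # FALSE
is.atomicV(list(1))  # FALSE
```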

... yes, I've somewhat hijacked your thread to talk about part
of the underlying problem(s) that I would like to address first.


Martin


