[Rd] class(<matrix>) |--> c("matrix", "array") -- and S3 dispatch

Pages, Herve hpages at fredhutch.org
Wed Jan 29 19:13:02 CET 2020


On 1/27/20 23:51, Martin Maechler wrote:
>>>>>> Pages, Herve
>>>>>>      on Tue, 21 Jan 2020 17:33:01 +0000 writes:
> 
>      > Dear Martin,
>      > What's the ETA for _R_CLASS_MATRIX_ARRAY_=TRUE to become the new
>      > unconditional behavior in R devel? Thanks!
> 
>      > H.
> 
> Thank you, Hervé, for asking / reminding.
> 
> It has been made so now, 3 days ago (svn r77714).

Yep, I've seen that. Already deployed on the Bioconductor build 
machines. Thanks!

H.

> 
> Martin
> 
> 
> 
> 
>      > On 11/21/19 08:57, Martin Maechler wrote:
>      >>
>      >> TLDR: This is quite technical, still somewhat important:
>      >> 1)  R 4.0.0 will become a bit more coherent: a matrix is an array
>      >> 2)  Your package (or one you use) may be affected.
>      >>
>      >>
>      >>>>>>> Martin Maechler
>      >>>>>>> on Fri, 15 Nov 2019 17:31:15 +0100 writes:
>      >>
>      >>>>>>> Pages, Herve
>      >>>>>>> on Thu, 14 Nov 2019 19:13:47 +0000 writes:
>      >>
>      >> >> On 11/14/19 05:47, Hadley Wickham wrote:
>      >> >>> On Sun, Nov 10, 2019 at 2:37 AM Martin Maechler ... wrote:
>      >>
>      >> [................]
>      >>
>      >> >>>>> Note again that both "matrix" and "array" are special [see ?class] as
>      >> >>>>> being of  __implicit class__  and I am considering that this
>      >> >>>>> implicit class behavior for these two should be slightly
>      >> >>>>> changed ....
>      >> >>>>>
>      >> >>>>> And indeed I think you are right on spot and this would mean
>      >> >>>>> that indeed the implicit class
>      >> >>>>> "matrix" should rather become c("matrix", "array").
>      >> >>>>
>      >> >>>> I've made up my mind (and not been contradicted by my fellow R
>      >> >>>> corers) to try to go there for R 4.0.0 next April.
>      >>
>      >> >>> I can't seem to find the previous thread, so would you mind being a
>      >> >>> bit more explicit here? Do you mean adding "array" to the implicit
>      >> >>> class?
>      >>
>      >> >> It's late in Europe ;-)
>      >>
>      >> >> That's my understanding. I think the plan is to have class(matrix())
>      >> >> return c("matrix", "array"). No class attributes added to matrix or
>      >> >> array objects.
>      >>
>      >> >> That's all that is needed to have inherits(matrix(), "array") return TRUE
>      >> >> (instead of FALSE at the moment) and S3 dispatch pick up the foo.array
>      >> >> method when foo(matrix()) is called and there is no foo.matrix method.
>      >>
>      >> > Thank you, Hervé!  That's exactly the plan.
>      >>
>      >> BUT what I (and Peter and Hervé and ....) had assumed is wrong:
>      >>
>      >> If I just change the class
>      >> (as I already did a few days ago, but you must activate the change
>      >> via environment variable, see below),
>      >>
>      >> S3 dispatch does *NOT* at all pick it up:
>      >> "matrix" (and "array") are even more special here (see below),
>      >> and from Hadley's questions, in hindsight I now see that he's been aware
>      >> of that and I hereby apologize to Hadley for not having thought
>      >> and looked more, when he asked ..
>      >>
>      >> Half an hour ago, I've done another source code commit (svn r77446),
>      >> to "R-devel" only, of course, and the R-devel NEWS now starts as
>      >>
>      >> ------------------------------------------------------------
>      >>
>      >> CHANGES IN R-devel:
>      >>
>      >> USER-VISIBLE CHANGES:
>      >>
>      >> •  .... intention that the next non-patch release should be 4.0.0.
>      >>
>      >> • R now builds by default against a PCRE2 library ........
>      >> ...................
>      >> ...................
>      >>
>      >> • For now only active when environment variable
>      >> _R_CLASS_MATRIX_ARRAY_ is set to non-empty, but planned to be the
>      >> new unconditional behavior when R 4.0.0 is released:
>      >>
>      >> Newly, matrix objects also inherit from class "array", namely,
>      >> e.g., class(diag(1)) is c("matrix", "array") which invalidates
>      >> code (wrongly) assuming that length(class(obj)) == 1, a wrong
>      >> assumption that is less frequently fulfilled now.  (Currently
>      >> only after setting _R_CLASS_MATRIX_ARRAY_ to non-empty.)
>      >>
>      >> S3 methods for "array", i.e., <someFun>.array(), are now also
>      >> dispatched for matrix objects.
>      >>
>      >> ------------------------------------------------------------
>      >> (where only the very last paragraph, of 1.5 lines, is new.)
>      >>
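
For package code affected by that NEWS entry, a minimal sketch of the
length(class(obj)) == 1 assumption next to a test that does not rely on
it. This is only an illustration, not taken from the R sources; the
helper names is_mat_fragile() and is_mat_robust() are made up.

```r
## Hypothetical helpers, for illustration only.

## Fragile: silently relies on class(x) being of length one.
is_mat_fragile <- function(x) class(x) == "matrix"

## Robust: does not care how long class(x) is.
is_mat_robust <- function(x) inherits(x, "matrix")   # or is.matrix(x)

m <- diag(1)
fragile <- is_mat_fragile(m)  # one value per element of class(m)
robust  <- is_mat_robust(m)   # always a single logical

## 'fragile' is as long as class(m), so 'if (fragile) ...' gets a
## length-2 condition as soon as class(m) is c("matrix", "array").
print(length(fragile) == length(class(m)))
print(robust)
```

Using inherits() or is.matrix() in conditions gives the same answer
with and without the new behavior.
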
>      >> Note the following
>      >> (if you use a version of R-devel, with svn rev >= 77446; which
>      >> you may get as a binary for Windows in about one day; everyone
>      >> else needs to compile from the sources .. or wait a bit, maybe
>      >> also not much longer than one day, for a docker image) :
>      >>
>      >>
>      >>> Sys.unsetenv("_R_CLASS_MATRIX_ARRAY_") # ==> current R behavior
>      >>> class(m <- diag(1))
>      >> [1] "matrix"
>      >>> Sys.setenv("_R_CLASS_MATRIX_ARRAY_" = "BOOH !") # ==> future R behavior
>      >>> class(m)
>      >> [1] "matrix" "array"
>      >>>
>      >>> foo <- function(x) UseMethod("foo")
>      >>> foo.array <- function(x) "made in foo.array()"
>      >>> foo(m)
>      >> [1] "made in foo.array()"
>      >>> Sys.unsetenv("_R_CLASS_MATRIX_ARRAY_")# ==> current R behavior
>      >>> foo(m)
>      >> Error in UseMethod("foo") :
>      >> no applicable method for 'foo' applied to an object of class "c('matrix', 'double', 'numeric')"
>      >>
>      >>> Sys.setenv("_R_CLASS_MATRIX_ARRAY_" = TRUE) # ==> future R behavior
>      >>> foo(m)
>      >> [1] "made in foo.array()"
>      >>> foo.A <- foo.array ; rm(foo.array)
>      >>> foo(m)
>      >> Error in UseMethod("foo") :
>      >> no applicable method for 'foo' applied to an object of class "c('matrix', 'array', 'double', 'numeric')"
>      >>>
>      >>
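
A small sketch of how this interacts with an existing "matrix" method,
as discussed earlier in the thread: the foo.array method is only picked
up for a matrix when no more specific method exists. The generics
f_both() and f_arr() below are hypothetical names, and the last call
depends on whether the running R already has the new behavior.

```r
## Hypothetical generics 'f_both' and 'f_arr', for illustration only.
f_both <- function(x) UseMethod("f_both")
f_both.matrix <- function(x) "f_both.matrix"
f_both.array  <- function(x) "f_both.array"

f_arr <- function(x) UseMethod("f_arr")
f_arr.array   <- function(x) "f_arr.array"
f_arr.default <- function(x) "f_arr.default"

m  <- matrix(1:6, 2, 3)
a3 <- array(1:24, 2:4)

print(f_both(m))   # the "matrix" method wins when both methods exist
print(f_both(a3))  # a 3-d array only ever reaches the "array" method
print(f_arr(a3))   # ditto
print(f_arr(m))    # "array" method if inherits(m, "array"), else the default
```
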
>      >> So, with my commit 77446, the  _R_CLASS_MATRIX_ARRAY_
>      >> environment variable also changes the
>      >>
>      >> "S3 dispatch determining class"
>      >>
>      >> mentioned as 'class' in the error message (of the two cases, old
>      >> and new) above,  which in R <= 3.6.x for a numeric matrix is
>      >>
>      >> c('matrix', 'double', 'numeric')
>      >>
>      >> and from R 4.0.0 on  will be
>      >>
>      >> c('matrix', 'array', 'double', 'numeric')
>      >>
>      >> Note that this is *not* (in R <= 3.6.x, nor very probably in R 4.0.0)
>      >> the same as  R's  class().
>      >> Hadley calls this long class vector the  'implicit class' -- which
>      >> is a good term but somewhat conflicting with R's (i.e. R-core's)
>      >> "definition" used in the  ?class  help page (for ca. 11 years).
>      >>
>      >> R's internal C code has a nice function, R_data_class2(),
>      >> which computes this 'S3-dispatch-class' character (vector) for
>      >> any R object, and R_data_class2() is indeed called from (the
>      >> underlying C function of)  R's UseMethod().
>      >>
>      >> Using the above fact of an error message,
>      >> I wrote a nice (quite well tested) function  my.class2()  which
>      >> returns this S3_dispatch_class() also in current versions of R:
>      >>
>      >> my.class2 <- function(x) { # use a fn name not used by any sane ..
>      >>     ## a generic without any methods: calling it always fails in UseMethod()
>      >>     foo.7.3.343 <- function(x) UseMethod("foo.7.3.343")
>      >>     msg <- tryCatch(foo.7.3.343(x), error=function(e) e$message)
>      >>     ## keep only what the message shows between the quotes after "of class"
>      >>     clm <- sub('"$', '', sub(".* of class \"", '', msg))
>      >>     if(is.language(x) || is.function(x))
>      >>         clm
>      >>     else {
>      >>         cl <- str2lang(clm)
>      >>         if(is.symbol(cl)) as.character(cl) else eval(cl)
>      >>     }
>      >> }
>      >>
>      >> ## str2lang() needs R >= 3.6.0:
>      >> if(getRversion() < "3.6.0") ## substitute for str2lang(), good enough here:
>      >>     str2lang <- function(s) parse(text = s, keep.source=FALSE)[[1]]
>      >>
>      >>
>      >> Now you can look at such things yourself:
>      >>
>      >> ## --------------------- the "interesting" cases : ---
>      >> ## integer and double
>      >> my.class2( pi) 	# == c("double",  "numeric")
>      >> my.class2(1:2) 	# == c("integer", "numeric")
>      >> ## matrix and array [also combined with int / double ] :
>      >> my.class2(matrix(1L, 2,3))   	# == c(matrixCL, "integer", "numeric")  <<<
>      >> my.class2(matrix(pi, 2,3))   	# == c(matrixCL,  "double", "numeric")  <<<
>      >> my.class2(array("A", 2:3))   	# == c(matrixCL,  "character")          <<<
>      >> my.class2(array(1:24, 2:4))  	# == c("array",  "integer", "numeric")
>      >> my.class2(array( pi , 2:4))  	# == c("array",   "double", "numeric")
>      >> my.class2(array(TRUE, 2:4))  	# == c("array", "logical")
>      >> my.class2(array(letters, 2:4))	# == c("array", "character")
>      >> my.class2(array(1:24 + 1i, 2))	# == c("array", "complex")
>      >>
>      >> ## other cases
>      >> my.class2(NA) 	# == class(NA) : "logical"
>      >> my.class2("A") 	# == class("A"): "character"
>      >> my.class2(as.raw(0:2)) 	# == "raw"
>      >> my.class2(1 + 2i) 	# == "complex"
>      >> my.class2(USJudgeRatings)#== "data.frame"
>      >> my.class2(class) 	# == "function" # also for a primitive
>      >> my.class2(globalenv()) 	# == "environment"
>      >> my.class2(quote(sin(x)))# == "call"
>      >> my.class2(quote(sin) )  # == "name"
>      >> my.class2(quote({}))	# == class(*) == "{"
>      >> my.class2(quote((.)))	# == class(*) == "("
>      >>
>      >> -----------------------------------------------------
>      >>
>      >> Note that, of course, the lines marked "<<<" above contain
>      >> 'matrixCL', which is "matrix" in "old" (i.e. current) R
>      >> and c("matrix", "array") in "new" (i.e. future) R.
>      >>
>      >> Last but not least: It's quite trivial (only a few words need to
>      >> be added to the sources; more to the documentation)  to add an R
>      >> function to base R which provides the same as my.class2() above,
>      >> (but much more efficiently, not via catching error messages !!),
>      >> and my current proposal for that function's name is  .class2()
>      >> {it should start with a dot ("."), as it's not for the simple
>      >> minded average useR ... and you know how I'm happy with
>      >> function names that do not need one single [Shift] key ...}
>      >>
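
Should such a function land in base R under the proposed name, code
that has to run on several R versions could look it up defensively.
A sketch only: .class2() is just the name proposed above, and the
wrapper name dispatch_class() is made up.

```r
## 'dispatch_class' is a hypothetical wrapper; '.class2' is only the
## *proposed* name and may not exist in the running version of R.
dispatch_class <- function(x) {
    if (exists(".class2", envir = baseenv(), mode = "function"))
        get(".class2", envir = baseenv())(x)
    else
        NULL  # not available here; my.class2() quoted above is the workaround
}

dc <- dispatch_class(matrix(pi, 2, 3))
if (is.null(dc)) {
    message("no .class2() in this version of R")
} else {
    print(dc)
}
```
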
>      >> The current plan contains
>      >>
>      >> 1)  Notify CRAN package maintainers (ca 140) whose packages no
>      >> longer pass R CMD check  when the feature is turned on
>      >> (via setting the environment variable) in R-devel.
>      >>
>      >> 2a) (Some) CRAN team members set _R_CLASS_MATRIX_ARRAY_ (to non-empty),
>      >> as part of the incoming checks, at least for all new CRAN submissions
>      >>
>      >> 2b) set the  _R_CLASS_MATRIX_ARRAY_ (to non-empty), as part of
>      >> ' R CMD check --as-cran <pkg>'
>      >>
>      >> 3)  Before the end of 2019, change the R sources (for R-devel)
>      >> such that it behaves as it behaves currently when the environment
>      >> variable is set *AND* abolish this environment variable from
>      >> the sources.  {read on to learn *why*}
>      >>
>      >> Consequently (to 3), R 4.0.0 will behave as indicated, unconditionally.
>      >>
>      >> Note that (as I've shown above in the first example set) this is
>      >> set up in such a manner that you can change the environment
>      >> variable during a *running* R session, and observe the effect immediately.
>      >> This, however, led to some slowdown of quite a bit of the R
>      >> code, because actually the environment variable has to be
>      >> checked quite often (easily dozens of times for simple R calls).
>      >>
>      >> For that reason, we want to do "3)" as quickly as possible.
>      >>
>      >> Please do not hesitate to ask or comment
>      >> -- here, not on Twitter, please --  noting that I'll be
>      >> basically offline for an extended weekend within 24h, now.
>      >>
>      >> I hope this will eventually lead to cleanup and clarity in
>      >> R, and hence should be worth the pain of broken
>      >> back-compatibility and having to adapt your (almost always only
>      >> sub-optimally written ;-)) R code,
>      >> see also my Blog   http://bit.ly/R_blog_class_think_2x
>      >>
>      >> Martin Maechler
>      >> ETH Zurich and R Core team
>      >>
> 
>      > --
>      > Hervé Pagès
> 
>      > Program in Computational Biology
>      > Division of Public Health Sciences
>      > Fred Hutchinson Cancer Research Center
>      > 1100 Fairview Ave. N, M1-B514
>      > P.O. Box 19024
>      > Seattle, WA 98109-1024
> 
>      > E-mail: hpages using fredhutch.org
>      > Phone:  (206) 667-5791
>      > Fax:    (206) 667-1319
> 


