[Rd] Suggestion: default print method for S3 generics could offer some insights on '...' among registered methods
Sebastian Meyer
@eb@meyer @end|ng |rom |@u@de
Wed Jun 11 10:25:54 CEST 2025
Am 10.06.25 um 21:10 schrieb Duncan Murdoch:
> On 2025-06-10 1:46 p.m., Michael Chirico wrote:
>> When it comes to adding more info in the help pages, we'd be remiss
>> not to point out the great engineering work the tidyverse folks have
>> put in to this end:
>>
>> https://dplyr.tidyverse.org/reference/mutate.html
>>
>>> Methods available in currently loaded packages: dbplyr (tbl_lazy), dplyr (data.frame) .
>>
>> https://github.com/tidyverse/dplyr/blob/9df9d63234f34837fa058bbd081a42dfb8a4eeb0/R/mutate.R#L78-L79
>> https://github.com/tidyverse/dplyr/blob/be3e3a05fd0081cb53168d6aedb417d62139b75d/R/doc-methods.R#L57-L75
>
> Those lines arise from this entry in the Rd file:
>
> \Sexpr[stage=render,results=rd]{dplyr:::methods_rd("mutate")}
>
> The dplyr:::methods_rd function basically just formats the results of a
> call similar to `methods("mutate")`. All the rest of that is by R Core,
including former R-core members, of course! AFAICS, the Rd \Sexpr macro
was introduced in R 2.10.0 and most of it was originally implemented by
svn author "murdoch".
Sebastian Meyer
> not tidyverse folks.
>
> Duncan Murdoch
>
>
>>
>> On Mon, Jun 9, 2025 at 2:25 PM Lluís Revilla <lluis.revilla using gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> I agree that showing that there are other methods might help.
>>> The print.function method could be modified to add this in addition to print.default output.
>>>
>>> But I guess (new) users would check the help page with ?as.data.frame and not print the method or use args() (if they don't check with their prefered LLM/agent).
>>> Would it be helpful to also report these numbers on the documentation page of the generic too?
>>>
>>> Best,
>>>
>>> Lluís
>>>
>>>
>>> PS: Trying to see what happens with ? and looking for a specific method I found that on my computer with R-devel (2025-06-08 r88288), the expressions below the "Not run:" on ? examples raise errors.
>>>
>>>
>>> On Mon, 9 Jun 2025 at 08:07, Michael Chirico <michaelchirico4 using gmail.com> wrote:
>>>>
>>>> Thanks Josh,
>>>>
>>>> With fresh eyes, it's definitely information overload for the
>>>> suggested output to take up more space than the function body itself.
>>>>
>>>> I'm not sure your suggestion gets at the heart of the issue, though,
>>>> which is about steering the user with regards to interpreting '...'
>>>> they see in the printout.
>>>>
>>>> Therefore I would suggest something like this as the economized
>>>> version of my original suggestion:
>>>>
>>>>> print(rbind)
>>>> function (..., deparse.level = 1)
>>>> # ...
>>>> <environment: namespace:base>
>>>> + 1 other method defines 3 other arguments. See methods(rbind).
>>>>
>>>>> print(as.data.frame)
>>>> function (x, row.names = NULL, optional = FALSE, ...)
>>>> # ...
>>>> <environment: namespace:base>
>>>> + 29 other methods define 11 other arguments. See methods(as.data.frame).
>>>>
>>>> Mike C
>>>>
>>>> On Sun, Jun 8, 2025 at 3:57 PM Joshua Ulrich <josh.m.ulrich using gmail.com> wrote:
>>>>>
>>>>> Hi Mike,
>>>>>
>>>>> On Fri, Jun 6, 2025 at 1:59 PM Michael Chirico
>>>>> <michaelchirico4 using gmail.com> wrote:
>>>>>>
>>>>>> There is a big difference in how to think of '...' for non-generic
>>>>>> functions like data.frame() vs. S3 generics.
>>>>>>
>>>>>> In the former, it means "any number of inputs" [e.g. columns]; in the
>>>>>> latter, it means "any number of inputs [think c()], as well as any
>>>>>> arguments that might be interpreted by class implementations".
>>>>>>
>>>>>> Understanding the difference for a given generic can require carefully
>>>>>> reading lots of documentation. print(<generic>), which is useful for
>>>>>> so many other contexts, can be a dead end.
>>>>>>
>>>>>> One idea is to extend the print() method to suggest to the reader
>>>>>> which other arguments are available (among registered generics). Often
>>>>>> ?<generic> will include the most common implementation, but not always
>>>>>> so.
>>>>>>
>>>>>> For rbind (in a --vanilla session), we currently have one method,
>>>>>> rbind.data.frame, that offers three arguments not present in the
>>>>>> generic: make.row.names, stringsAsFactors, and factor.exclude. The
>>>>>> proposal would be to mention this in the print(rbind) output somehow,
>>>>>> e.g.
>>>>>>
>>>>>>> print(rbind)
>>>>>> function (..., deparse.level = 1)
>>>>>> .Internal(rbind(deparse.level, ...))
>>>>>> <bytecode: 0x73d4fd824e20>
>>>>>> <environment: namespace:base>
>>>>>>
>>>>>> +Other arguments implemented by methods
>>>>>> + factor.exclude: rbind.data.frame
>>>>>> + make.row.names: rbind.data.frame
>>>>>> + stringsAsFactors: rbind.data.frame
>>>>>>
>>>>>> I suggest grouping by argument, not generic, although something like
>>>>>> this could be OK too:
>>>>>>
>>>>>> +Signatures of other methods
>>>>>> + rbind.data.frame(..., deparse.level = 1, make.row.names = TRUE,
>>>>>> stringsAsFactors = FALSE,
>>>>>> + factor.exclude = TRUE)
>>>>>>
>>>>>> Where it gets more interesting is when there are many methods, e.g.
>>>>>> for as.data.frame (again, in a --vanilla session):
>>>>>>
>>>>>>> print(as.data.frame)
>>>>>> function (x, row.names = NULL, optional = FALSE, ...)
>>>>>> {
>>>>>> if (is.null(x))
>>>>>> return(as.data.frame(list()))
>>>>>> UseMethod("as.data.frame")
>>>>>> }
>>>>>> <bytecode: 0x73d4fc1e70d0>
>>>>>> <environment: namespace:base>
>>>>>>
>>>>>> +Other arguments implemented by methods
>>>>>> + base: as.data.frame.table
>>>>>> + check.names: as.data.frame.list
>>>>>> + col.names: as.data.frame.list
>>>>>> + cut.names: as.data.frame.list
>>>>>> + fix.empty.names: as.data.frame.list
>>>>>> + make.names: as.data.frame.matrix, as.data.frame.model.matrix
>>>>>> + new.names: as.data.frame.list
>>>>>> + nm: as.data.frame.bibentry, as.data.frame.complex, as.data.frame.Date,
>>>>>> + as.data.frame.difftime, as.data.frame.factor, as.data.frame.integer,
>>>>>> + as.data.frame.logical, as.data.frame.noquote, as.data.frame.numeric,
>>>>>> + as.data.frame.numeric_version, as.data.frame.ordered,
>>>>>> + as.data.frame.person, as.data.frame.POSIXct, as.data.frame.raw
>>>>>> + responseName: as.data.frame.table
>>>>>> + sep: as.data.frame.table
>>>>>> + stringsAsFactors: as.data.frame.character, as.data.frame.list,
>>>>>> + as.data.frame.matrix, as.data.frame.table
>>>>>>
>>>>>> Or
>>>>>>
>>>>>> +Signatures of other methods
>>>>>> + as.data.frame.aovproj(x, ...)
>>>>>> + as.data.frame.array(x, row.names = NULL, optional = FALSE, ...)
>>>>>> + as.data.frame.AsIs(x, row.names = NULL, optional = FALSE, ...)
>>>>>> + as.data.frame.bibentry(x, row.names = NULL, optional = FALSE, ...,
>>>>>> nm = deparse1(substitute(x)))
>>>>>> + as.data.frame.character(x, ..., stringsAsFactors = FALSE)
>>>>>> + as.data.frame.citation(x, row.names = NULL, optional = FALSE, ...)
>>>>>> + as.data.frame.complex(x, row.names = NULL, optional = FALSE, ...,
>>>>>> nm = deparse1(substitute(x)))
>>>>>> + as.data.frame.data.frame(x, row.names = NULL, ...)
>>>>>> + as.data.frame.Date(x, row.names = NULL, optional = FALSE, ..., nm =
>>>>>> deparse1(substitute(x)))
>>>>>> + as.data.frame.default(x, ...)
>>>>>> + as.data.frame.difftime(x, row.names = NULL, optional = FALSE, ...,
>>>>>> nm = deparse1(substitute(x)))
>>>>>> + as.data.frame.factor(x, row.names = NULL, optional = FALSE, ..., nm
>>>>>> = deparse1(substitute(x)))
>>>>>> + as.data.frame.ftable(x, row.names = NULL, optional = FALSE, ...)
>>>>>> + as.data.frame.integer(x, row.names = NULL, optional = FALSE, ...,
>>>>>> nm = deparse1(substitute(x)))
>>>>>> + as.data.frame.list(x, row.names = NULL, optional = FALSE, ...,
>>>>>> cut.names = FALSE,
>>>>>> + col.names = names(x), fix.empty.names = TRUE, new.names =
>>>>>> !missing(col.names),
>>>>>> + check.names = !optional, stringsAsFactors = FALSE)
>>>>>> + as.data.frame.logical(x, row.names = NULL, optional = FALSE, ...,
>>>>>> nm = deparse1(substitute(x)))
>>>>>> + as.data.frame.logLik(x, ...)
>>>>>> + as.data.frame.matrix(x, row.names = NULL, optional = FALSE,
>>>>>> make.names = TRUE,
>>>>>> + ..., stringsAsFactors = FALSE)
>>>>>> + as.data.frame.model.matrix(x, row.names = NULL, optional = FALSE,
>>>>>> make.names = TRUE,
>>>>>> + ...)
>>>>>> + as.data.frame.noquote(x, row.names = NULL, optional = FALSE, ...,
>>>>>> nm = deparse1(substitute(x)))
>>>>>> + as.data.frame.numeric(x, row.names = NULL, optional = FALSE, ...,
>>>>>> nm = deparse1(substitute(x)))
>>>>>> + as.data.frame.numeric_version(x, row.names = NULL, optional =
>>>>>> FALSE, ..., nm = deparse1(substitute(x)))
>>>>>> + as.data.frame.ordered(x, row.names = NULL, optional = FALSE, ...,
>>>>>> nm = deparse1(substitute(x)))
>>>>>> + as.data.frame.person(x, row.names = NULL, optional = FALSE, ..., nm
>>>>>> = deparse1(substitute(x)))
>>>>>> + as.data.frame.POSIXct(x, row.names = NULL, optional = FALSE, ...,
>>>>>> nm = deparse1(substitute(x)))
>>>>>> + as.data.frame.POSIXlt(x, row.names = NULL, optional = FALSE, ...)
>>>>>> + as.data.frame.raw(x, row.names = NULL, optional = FALSE, ..., nm =
>>>>>> deparse1(substitute(x)))
>>>>>> + as.data.frame.table(x, row.names = NULL, ..., responseName =
>>>>>> "Freq", stringsAsFactors = TRUE,
>>>>>> + sep = "", base = list(LETTERS))
>>>>>> + as.data.frame.ts(x, ...)
>>>>>>
>>>>>> Obviously that's a bit more cluttered, but as.data.frame() should be a
>>>>>> pretty unusual case. It also highlights better the differences in the
>>>>>> two approaches: the former economizes on space and focuses on what
>>>>>> sorts of arguments are available; the latter shows the defaults, does
>>>>>> not hide the arguments shared with the generic, and will always
>>>>>> produce as many lines as there are methods.
>>>>>>
>>>>>> There are other edge cases to think through (multiple registrations,
>>>>>> interactions with S4, primitives, ...), but I want to first check with
>>>>>> the list if this looks workable & valuable enough to pursue.
>>>>>>
>>>>> I like and appreciate the intent behind your suggestion, though I
>>>>> don't like all the extra output from printing the generic. I want to
>>>>> look at the function body when I print it. And as you show, it can
>>>>> output a lot of information you're probably not interested in.
>>>>>
>>>>> What about adding the number of methods to printed output for
>>>>> generics, and a suggestion to use `methods(some_generic)` to get a
>>>>> list of them? Then you can use help(some_method) or args(some_method)
>>>>> to get more information about the specific method(s) you're interested
>>>>> in.
>>>>>
>>>>> Best,
>>>>> Josh
>>>>>
>>>>>> Mike C
>>>>>>
>>>>>> ----
>>>>>>
>>>>>> Code that helped with the above:
>>>>>>
>>>>>> f = as.data.frame
>>>>>> # NB: methods() and getAnywhere() require {utils}
>>>>>> m = methods(f)
>>>>>> generic_args = names(formals(f))
>>>>>> f_methods = lapply(m, \(fn) getAnywhere(fn)$objs[[1L]])
>>>>>> names(f_methods) = m
>>>>>> new_args = sapply(f_methods, \(g) setdiff(names(formals(g)), generic_args))
>>>>>> with( # group by argument name
>>>>>> data.frame(method = rep(names(new_args), lengths(new_args)), arg =
>>>>>> unlist(new_args), row.names=NULL),
>>>>>> {tbl = tapply(method, arg, toString); writeLines(paste0(names(tbl),
>>>>>> ": ", tbl))}
>>>>>> )
>>>>>> signatures=sapply(f_methods, \(g) paste(head(format(args(g)), -1),
>>>>>> collapse="\n"))
>>>>>> writeLines(paste0(names(signatures), gsub("^\\s*function\\s*", "", signatures)))
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-devel using r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Joshua Ulrich | about.me/joshuaulrich
>>>>> FOSS Trading | www.fosstrading.com
>>>>
>>>> ______________________________________________
>>>> R-devel using r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>> ______________________________________________
>> R-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
More information about the R-devel
mailing list