[Rd] Inefficiency in df$col

Martin Maechler m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Mon Feb 4 17:34:34 CET 2019


>>>>> peter dalgaard 
>>>>>     on Mon, 4 Feb 2019 16:48:12 +0100 writes:

    > Does either of you have a patch against current R-devel? 
    > I tried the obvious, but the build dies with

    > building package 'tools'
    > all.R is unchanged
    > ../../../../library/tools/libs/x86_64/tools.so is unchanged
    > installing 'sysdata.rda'
    > Error in get(method, envir = home) : object '$.data.frame' not found
    > Error: unable to load R code in package 'tools'
    > Execution halted

    > ...and I can't really be arsed to dig into tools to see exactly where it is hardcoding the existence of $.data.frame.

    > -pd

Well, we two have been working "in parallel"...

I've just sent an e-mail to R-core about this:

It's really  file.size() which does need a 'data.frame' method for `$`
because it is basically a wrapper  file.info(..)$size  and
file.info(..) does return a data frame.

I've been suggesting the byte compiler's optimizations here to be
the problem ...

    >> On 4 Feb 2019, at 15:32 , Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
    >> 
    >> On 04/02/2019 9:20 a.m., Radford Neal wrote:
    >>>>> I think you might want to just delete the definition of $.data.frame,
    >>>>> reverting to the situation before R-3.1.0.
    >>>> 
    >>>> I imagine the cause is that the list version is done in C code rather
    >>>> than R code (i.e. there's no R function `$.list`).  So an alternative
    >>>> solution would be to also implement `$.data.frame` in the underlying C
    >>>> code.  This won't be quite as fast (it needs that test for NULL), but
    >>>> should be close in the full match case.
    >>> I maybe wasn't completely clear.  The $ operator for data frames was
    >>> previously done in C - since it was done by the same primitive as for
    >>> lists.  In R-3.1.0, this was changed - producing a massive slowdown -
    >>> for the purpose of giving a warning on partial matches even if the
    >>> user had not set the warnPartialMatchDollar option to TRUE.  In
    >>> R-3.1.1, this was changed to not warn unless warnPartialMatchDollar was
    >>> TRUE which was the PREVIOUS behaviour.  In other words, this change
    >>> reverted the change made in R-3.1.0.  But instead of simply deleting
    >>> the definition of $.data.frame, R-3.1.1 added extra code to it, the
    >>> only effect of which is to slightly change the wording of the warning
    >>> message from what is produced for any other list, while still retaining
    >>> the massive slowdown.
    >>> There is no need for you to write $.data.frame in C.  You just need
    >>> to delete the version written in R.
    >> 
    >> Sorry, I did misunderstand.  Thanks for the clarification.
    >> 
    >> But if the "You" in your last sentence meant me, it needs to be "They": I am not a member of R Core and can't make any changes to the sources.
    >> 
    >> Duncan Murdoch
    >> 
    >> ______________________________________________
    >> R-devel using r-project.org mailing list
    >> https://stat.ethz.ch/mailman/listinfo/r-devel

    > -- 
    > Peter Dalgaard, Professor,
    > Center for Statistics, Copenhagen Business School
    > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
    > Phone: (+45)38153501
    > Office: A 4.23
    > Email: pd.mes using cbs.dk  Priv: PDalgd using gmail.com

    > ______________________________________________
    > R-devel using r-project.org mailing list
    > https://stat.ethz.ch/mailman/listinfo/r-devel



More information about the R-devel mailing list