[Rd] Object.size() should not visit every element for alt-rep strings, or there should be an altstring_objectsize_method

Tierney, Luke luke-tierney at uiowa.edu
Thu Jan 31 14:35:34 CET 2019


You should really take this up with RStudio. Calling object.size on
every top-level assignment, as they appear to do, is a bad idea, even
without ALTREP. object.size is only a cheap operation for simple
atomic vectors. For anything with recursive structure it needs to walk
the object, so the effort is proportional to object size:

> x <- rep("A", 1e8)
> system.time(object.size(x))
    user  system elapsed
   1.222   0.624   1.850 
> x <- rep(list(1), 1e8)
> system.time(object.size(x))
    user  system elapsed
   1.247   0.022   1.273
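
For contrast, here is a minimal sketch of the cheap case next to the
expensive one. The variable names and the smaller length (chosen so it
runs quickly) are assumptions made for illustration, and timings will
vary by machine.

```r
# Smaller than the examples above so the sketch runs quickly.
n <- 1e7

# Plain numeric vector: the size follows from type and length alone.
x_num <- numeric(n)
t_num <- system.time(s_num <- object.size(x_num))

# Character vector of the same length: every element is inspected.
x_chr <- rep("A", n)
t_chr <- system.time(s_chr <- object.size(x_chr))

print(s_num, units = "MB")
print(s_chr, units = "MB")

# Elapsed seconds for each call.
c(numeric = t_num[["elapsed"]], character = t_chr[["elapsed"]])
```

The numeric call should return almost immediately, while the time for
the character call grows with the length of the vector.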

The current help for object.size says

      Provides an estimate of the memory that is being used to store an
      R object.

If this is interpreted as the current memory use, which could change
in the ALTREP context (or for environments, though there the changes
are ignored), then we could define object.size for ALTREP objects to
avoid any ALTREP-specific computation. I'm not convinced yet that this
is a good idea, but even if we do change this at the R level,
RStudio would still be well-advised to have another look at what they
are doing.
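
As a rough sketch of what a more careful front end might do, the
helper below reports a size only when object.size does not have to
walk the object. The name cheap_size and its policy are hypothetical;
this is not how RStudio or R itself behaves.

```r
# Hypothetical helper (not part of R or RStudio): return a size in bytes
# only when object.size() can compute it from type and length alone,
# and NA otherwise.
cheap_size <- function(x) {
  # Lists, environments and character vectors need a walk over their
  # elements; attributes (e.g. names) may need one too, so skip those.
  simple <- is.atomic(x) && !is.character(x) && is.null(attributes(x))
  if (simple) as.numeric(object.size(x)) else NA_real_
}

cheap_size(numeric(1e6))    # a size in bytes
cheap_size(rep("A", 1e6))   # NA: character vectors are skipped
cheap_size(list(1, 2, 3))   # NA: lists are skipped
```

A caller could show the length or the class in place of a size when
the result is NA, which keeps the cost of each top-level assignment
independent of the object's size.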

Best,

luke

On Tue, 15 Jan 2019, Travers Ching wrote:

>
> Below is a toy alt-rep string example, that generates N random strings:
>
> https://gist.github.com/traversc/a48a504eb062554f2d6ff8043ca16f9c
>
> example:
> `x <- altrandomStrings(1e8)`
> `head(x)`
> [1] "2PN0bdwPY7CA8M06zVKEkhHgZVgtV1" "5PN2qmWqBlQ9wQj99nsQzldVI5ZuGX" ...
> `object.size(x)`
>
> object.size will call the Elt method (the one registered with
> `R_set_altstring_Elt_method`) for every single element,
> materializing (slowly) every element of the vector.  This is a
> problem mostly in RStudio, since object.size is called
> automatically there, defeating the purpose of alt-rep entirely.

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney using uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu


