[Rd] [External] Re: ALTREP wrappers and factors

Tierney, Luke luke-tierney using uiowa.edu
Fri Jul 19 14:34:48 CEST 2019


On Fri, 19 Jul 2019, Gabriel Becker wrote:

> Hi Jiefei and Kylie,
>
> Great to see people engaging with the ALTREP framework and identifying
> places we may need more tooling. Comments inline.
>
> On Thu, Jul 18, 2019 at 12:22 PM King Jiefei <szwjf08 using gmail.com> wrote:
>
>>
>> If that is the case and you are 100% sure the reference number should be 1
>> for your variable *y*, my solution is to call *SET_NAMED* in C++ to reset
>> the reference number. Note that you need to unbind your local variable
>> before you reset the number. To return an unbound SEXP, the C++ function
>> should be placed at the end of your *matter:::as.altrep* function. I don't
>> know if there is any simpler way to do that and I'll be happy to see any
>> opinion.
>>
>
> So as far as I know, manually setting the NAMED value on any SEXP the
> garbage collector is aware of is a direct violation of the C API contract
> and not something that package code should ever be doing.
>
> It's not at all clear to me that you can *ever* be 100% sure that the
> reference number should be 1 when it is not currently 1 for an R object
> that exists at the R level (as opposed to only in pure C code). Sure, maybe
> the object is created within the body of your R function instead of being
> passed in, but what if someone is debugging your function and assigns the
> value to the global environment using <<- for later inspection? Now you
> have an invalidly low NAMED value, i.e., you have a segfault coming. I know
> of no way for you to prevent this or even know it has happened.

SET_NAMED should NEVER be used in a package. In fact it will hopefully
disappear at some point not too far in the future.
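
A minimal sketch of the debugging scenario Gabe describes, using only base R.
The names debug_me and saved are hypothetical. It shows that a <<- assignment
creates a second reference to the same object, so R has to copy on the next
modification; forcing the reference count back down would skip that copy and
corrupt the saved value.

```r
# Hypothetical function standing in for package code that builds a vector.
debug_me <- function() {
  y <- c(1L, 2L, 3L)
  # Someone debugging the function stashes the value for later inspection.
  # From here on, y and the global 'saved' refer to the same object.
  saved <<- y
  # Because the object is shared, R duplicates it before modifying y,
  # so 'saved' keeps the original contents.
  y[1] <- 99L
  y
}

result <- debug_me()
print(result)
print(saved)
```

If the reference count were reset by hand before the assignment to y[1], R
could modify the shared object in place and saved would change as well.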

>> On Thu, Jul 18, 2019 at 3:28 AM Bemis, Kylie <k.bemis using northeastern.edu>
>> wrote:
>>
>>> Hello,
>>>
>>> I’m experimenting with ALTREP and was wondering if there is a preferred
>>> way to create an ALTREP wrapper vector without using
>>> .Internal(wrap_meta(…)), which R CMD check doesn’t like since it uses an
>>> .Internal() function.
>>
>
> So there is the .doSortWrap function in base (and its currently
> inexplicably identical clone .doWrap), which is an R-level function that
> calls down to .Internal(wrap_meta(...)). You can use it, but it doesn't
> look general enough for what I think you need (it was written for things
> that have just been sorted, thus the name). Specifically, as currently
> written it cannot indicate that things are of unknown sortedness. If
> matter vectors are guaranteed to be sorted for some reason, though, you can
> use this. I'll talk to Luke about whether we want to generalize this; it
> would be easy to have it support the full space of metadata for wrappers
> and be a general-purpose wrapper-maker, but that isn't what it is right now.
>
> At the C level, it looks like we do make R_tryWrap available (it appears in
> Rinternals.h, and not within a USE_RINTERNALS section), so you can call that
> from your own C(++) code. This creates a wrapper that has no metadata on it
> (or rather it has metadata, but the metadata indicates that no special info
> is known about the vector).

At this point we are not ready to cast in stone an interface for
creating wrappers from R. The C function R_tryWrap could be used, but
it is still subject to change.

You might try your example with a larger vector. In R 3.6.x
structure() should produce a wrapper for length 100 or more.
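
A minimal sketch of that experiment, using only base R. The attribute name
"note" is hypothetical, and whether a wrapper actually appears depends on the
R version and on the length threshold mentioned above, so no output is shown.

```r
# A vector comfortably above the length threshold mentioned above.
x <- seq_len(1000L)

# Attach an attribute via structure(); the data itself is not modified.
y <- structure(x, note = "demo")

# Inspect the internal representation. If a wrapper was created, the first
# line of the output should mention "wrapper".
.Internal(inspect(y))

# For comparison, a short vector like the length-3 example in this thread.
short <- structure(seq_len(10L), note = "demo")
.Internal(inspect(short))
```

.Internal(inspect()) is fine for interactive exploration like this; the
R CMD check concern only applies to calls made from package code.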

Best,

luke

>>
>>> I was trying to create a factor that used an ALTREP integer, but
>>> attempting to set the class and levels attributes always ended up
>>> duplicating and materializing the integer vector. Using the wrapper
>>> avoided this issue.
>>>
>>> Here is my initial ALTREP integer vector:
>>>
>>>> fc0 <- factor(c("a", "a", "b"))
>>>>
>>>> y <- matter::as.matter(as.integer(fc0))
>>>> y <- matter:::as.altrep(y)
>>>>
>>>> .Internal(inspect(y))
>>> @7fb0ce78c0f0 13 INTSXP g0c0 [NAM(7)] matter vector (mode=3, len=3, mem=0)
>>>
>>> Here is what I get without a wrapper:
>>>
>>>> fc1 <- structure(y, class="factor", levels=levels(fc0))
>>>> .Internal(inspect(fc1))
>>> @7fb0cae66408 13 INTSXP g0c2 [OBJ,NAM(2),ATT] (len=3, tl=0) 1,1,2
>>> ATTRIB:
>>>   @7fb0ce771868 02 LISTSXP g0c0 []
>>>     TAG: @7fb0c80043d0 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "class" (has value)
>>>     @7fb0c9fcbe90 16 STRSXP g0c1 [NAM(7)] (len=1, tl=0)
>>>       @7fb0c80841a0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "factor"
>>>     TAG: @7fb0c8004050 01 SYMSXP g1c0 [MARK,NAM(7),LCK,gp=0x4000] "levels" (has value)
>>>     @7fb0d1dd58c8 16 STRSXP g0c2 [MARK,NAM(7)] (len=2, tl=0)
>>>       @7fb0c81bf4c0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "a"
>>>       @7fb0c90ba728 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "b"
>>>
>>> Here is what I get with a wrapper:
>>>
>>>> fc2 <- structure(.Internal(wrap_meta(y, 0, 0)), class="factor", levels=levels(fc0))
>>>> .Internal(inspect(fc2))
>>> @7fb0ce764630 13 INTSXP g0c0 [OBJ,NAM(2),ATT]  wrapper [srt=0,no_na=0]
>>>   @7fb0ce78c0f0 13 INTSXP g0c0 [NAM(7)] matter vector (mode=3, len=3, mem=0)
>>> ATTRIB:
>>>   @7fb0ce764668 02 LISTSXP g0c0 []
>>>     TAG: @7fb0c80043d0 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "class" (has value)
>>>     @7fb0c9fcb010 16 STRSXP g0c1 [NAM(7)] (len=1, tl=0)
>>>       @7fb0c80841a0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "factor"
>>>     TAG: @7fb0c8004050 01 SYMSXP g1c0 [MARK,NAM(7),LCK,gp=0x4000] "levels" (has value)
>>>     @7fb0d1dd58c8 16 STRSXP g0c2 [MARK,NAM(7)] (len=2, tl=0)
>>>       @7fb0c81bf4c0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "a"
>>>       @7fb0c90ba728 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "b"
>>>
>>> Is there a way to do this that doesn’t rely on .Internal() and won’t
>>> produce R CMD check warnings?
>>>
>>> ~~~
>>> Kylie Ariel Bemis
>>> Khoury College of Computer Sciences
>>> Northeastern University
>>> kuwisdelu.github.io<https://kuwisdelu.github.io>
>>>
>>> ______________________________________________
>>> R-devel using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney using uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu

