[Rd] ALTREP wrappers and factors

Bemis, Kylie k.bemis using northeastern.edu
Fri Aug 16 17:50:20 CEST 2019


Using R_tryWrap() at the C-level works perfectly and does what I need. Thanks, Gabe!

Yes, my reference count is maxed (I assume) because I am using MARK_NOT_MUTABLE().

Which makes me think I may want to return a wrapped matter/ALTREP object by default, so the user can set names(), dim(), etc., without triggering a potentially costly duplication. The data payload is intended to be immutable, but the attributes aren’t.

Decoupling the attributes and other metadata from the data payload seems like a good thing to have generally.

Are there any potential drawbacks of using R_tryWrap() that I should know about, besides an additional method dispatch happening somewhere?

Thanks again!

~~~
Kylie Ariel Bemis
Khoury College of Computer Sciences
Northeastern University
kuwisdelu.github.io<https://kuwisdelu.github.io>

On Jul 19, 2019, at 4:00 AM, Gabriel Becker <gabembecker using gmail.com> wrote:

Hi Jiefei and Kylie,

Great to see people engaging with the ALTREP framework and identifying places we may need more tooling. Comments inline.

On Thu, Jul 18, 2019 at 12:22 PM King Jiefei <szwjf08 using gmail.com> wrote:

> If that is the case and you are 100% sure the reference number should be 1
> for your variable y, my solution is to call SET_NAMED in C++ to reset
> the reference number. Note that you need to unbind your local variable
> before you reset the number. To return an unbound SEXP, the C++ function
> should be placed at the end of your matter:::as.altrep function. I don't
> know if there is any simpler way to do that and I'll be happy to see any
> opinion.

So as far as I know, manually setting the NAMED value on any SEXP the garbage collector is aware of is a direct violation of the C API contract and not something that package code should ever be doing.

It's not at all clear to me that you can ever be 100% sure that the reference number should be 1 when it is not currently 1 for an R object that exists at the R level (as opposed to only in pure C code). Sure, maybe the object is created within the body of your R function instead of being passed in, but what if someone is debugging your function and assigns the value to the global environment using <<- for later inspection? Now you have an invalidly low NAMED value, i.e., you have a segfault coming. I know of no way for you to prevent this or even know it has happened.



On Thu, Jul 18, 2019 at 3:28 AM Bemis, Kylie <k.bemis using northeastern.edu> wrote:

> Hello,
>
> I’m experimenting with ALTREP and was wondering if there is a preferred
> way to create an ALTREP wrapper vector without using
> .Internal(wrap_meta(…)), which R CMD check doesn’t like since it uses an
> .Internal() function.

So there is the .doSortWrap function in base (and its currently, inexplicably, identical clone .doWrap), which is an R-level function that calls down to .Internal(wrap_meta(...)). You can use it, but it doesn't look general enough for what I think you need (it was written for things that have just been sorted, thus the name). Specifically, as currently written it's not able to indicate that things are of unknown sortedness. If matter vectors are guaranteed to be sorted for some reason, though, you can use this. I'll talk to Luke about whether we want to generalize this; it would be easy to have it support the full space of metadata for wrappers and be a general-purpose wrapper-maker, but that isn't what it is right now.

At the C level, it looks like we do make R_tryWrap available (it appears in Rinternals.h, and not within a USE_RINTERNALS section), so you can call that from your own C(++) code. This creates a wrapper that has no metadata on it (or rather, it has metadata, but the metadata indicates that no special info is known about the vector).
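
As a rough illustration only, a minimal sketch of such a call from package C code might look like the following. The entry point name C_wrap_vector is hypothetical (it is not part of R or of matter), and the sketch assumes R_tryWrap is declared in Rinternals.h as described above; it has to be built as part of a package or with R CMD SHLIB rather than as a standalone program.

```c
#include <R.h>
#include <Rinternals.h>

/* Hypothetical .Call entry point; the name C_wrap_vector is illustrative
   and is not part of R or of the matter package. */
SEXP C_wrap_vector(SEXP x)
{
    /* x arrives as a .Call argument, so R keeps it protected for the
       duration of this call; no extra PROTECT is needed for this one call. */
    return R_tryWrap(x);
}
```

From R, this would be invoked with something like .Call(C_wrap_vector, y) once the routine is registered, and attributes such as class and levels would then be set on the returned wrapper rather than on y itself.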

>
> I was trying to create a factor that used an ALTREP integer, but
> attempting to set the class and levels attributes always ended up
> duplicating and materializing the integer vector. Using the wrapper avoided
> this issue.
>
> Here is my initial ALTREP integer vector:
>
> > fc0 <- factor(c("a", "a", "b"))
> >
> > y <- matter::as.matter(as.integer(fc0))
> > y <- matter:::as.altrep(y)
> >
> > .Internal(inspect(y))
> @7fb0ce78c0f0 13 INTSXP g0c0 [NAM(7)] matter vector (mode=3, len=3, mem=0)
>
> Here is what I get without a wrapper:
>
> > fc1 <- structure(y, class="factor", levels=levels(fc0))
> > .Internal(inspect(fc1))
> @7fb0cae66408 13 INTSXP g0c2 [OBJ,NAM(2),ATT] (len=3, tl=0) 1,1,2
> ATTRIB:
>   @7fb0ce771868 02 LISTSXP g0c0 []
>     TAG: @7fb0c80043d0 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "class" (has value)
>     @7fb0c9fcbe90 16 STRSXP g0c1 [NAM(7)] (len=1, tl=0)
>       @7fb0c80841a0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "factor"
>     TAG: @7fb0c8004050 01 SYMSXP g1c0 [MARK,NAM(7),LCK,gp=0x4000] "levels" (has value)
>     @7fb0d1dd58c8 16 STRSXP g0c2 [MARK,NAM(7)] (len=2, tl=0)
>       @7fb0c81bf4c0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "a"
>       @7fb0c90ba728 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "b"
>
> Here is what I get with a wrapper:
>
> > fc2 <- structure(.Internal(wrap_meta(y, 0, 0)), class="factor", levels=levels(fc0))
> > .Internal(inspect(fc2))
> @7fb0ce764630 13 INTSXP g0c0 [OBJ,NAM(2),ATT]  wrapper [srt=0,no_na=0]
>   @7fb0ce78c0f0 13 INTSXP g0c0 [NAM(7)] matter vector (mode=3, len=3, mem=0)
> ATTRIB:
>   @7fb0ce764668 02 LISTSXP g0c0 []
>     TAG: @7fb0c80043d0 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "class" (has value)
>     @7fb0c9fcb010 16 STRSXP g0c1 [NAM(7)] (len=1, tl=0)
>       @7fb0c80841a0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "factor"
>     TAG: @7fb0c8004050 01 SYMSXP g1c0 [MARK,NAM(7),LCK,gp=0x4000] "levels" (has value)
>     @7fb0d1dd58c8 16 STRSXP g0c2 [MARK,NAM(7)] (len=2, tl=0)
>       @7fb0c81bf4c0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "a"
>       @7fb0c90ba728 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "b"
>
> Is there a way to do this that doesn’t rely on .Internal() and won’t
> produce R CMD check warnings?
>
> ~~~
> Kylie Ariel Bemis
> Khoury College of Computer Sciences
> Northeastern University
> kuwisdelu.github.io<https://kuwisdelu.github.io>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
