[Rd] By default, `names<-` alters S4 objects

Hervé Pagès hpages at fhcrc.org
Tue May 17 07:00:18 CEST 2011


On 11-05-16 04:13 PM, Hervé Pagès wrote:
> On 11-05-16 01:53 PM, John Chambers wrote:
>>
>>
>> On 5/16/11 10:09 AM, Hervé Pagès wrote:
>>> On 11-05-16 09:36 AM, John Chambers wrote:
>>>> You set up a names slot in a non-vector. Maybe that should be allowed,
>>>> maybe not. But in any case I would not expect the names() primitive to
>>>> find it, because your object has a non-vector type ("S4").
>>>
>>> But the names<-() primitive *does* find it. So either names() and
>>> names<-() should both find it, or they shouldn't. I mean, if you care
>>> about consistency and predictability of course.
>>
>> That's not the only case where borderline or mistaken behavior is caught
>> on assignment, but not on access. The argument is that assignment can
>> afford to check things, but access needs to be fast. Slot access is
>> another case. There, assignment ensures legality so access can be quick.
>>
>> The catch is that there are sometimes backdoor ways to assignments,
>> partly because slots, attributes and some "builtin" properties like
>> names overlap.
>>
>> What we were talking about before was trying to evolve a sensible rule
>> for assigning names to S4 objects. Let's try to discuss what people need
>> to do before carping or indulging in sarcasm.
>
> What *you* were talking about but not what my original post was about.
> Anyway, about the following proposal:
>
> 1. If the class has a vector data slot and no names slot, assign the
> names but with a warning.
>
> 2. Otherwise, throw an error.
>
> (I.e., I would prefer an error throughout, but discretion ....)
>
> I personally don't like it because it breaks inheritance. Let's
> say I have a class B with a vector data slot and no names slot.
> According to 1. names<-() would work out-of-the-box on it (with
> a warning), but now if I extend it by adding a names slot, it
> breaks.
>
> One thing to consider though is that this works right now (and with
> no warning):
>
>  > setClass("I", contains="integer")
> [1] "I"
>  > i <- new("I", 1:4)
>  > names(i) <- LETTERS[1:4]
>  > attributes(i)
> $class
> [1] "I"
> attr(,"package")
> [1] ".GlobalEnv"
>
> $names
> [1] "A" "B" "C" "D"
>
>  > names(i)
> [1] "A" "B" "C" "D"
>
> and it's probably what most people would expect (sounds reasonable
> after all). So this needs to keep working (with no warning). I can
> see 2 ways to avoid breaking inheritance:
>
> (a) not allow a names slot to be added to class I or any
> of its subclasses (in other words the .Data and names
> slots cannot coexist),
> or
> (b) have names() and names<-() keep working when the names slot is
> added but that is maybe dangerous as it might break C code that
> is trying to access the names, that is, inheritance might break
> but now at the C level
>
> Now for classes that don't have a .Data slot, they can of course
> have a names slot. I don't have a strong opinion on whether names()
> and names<-() should access it by default, but honestly that's really
> a very small convenience offered to the developer of the class. Also,
> for the sake of consistency, the same would need to be done for dim,
> dimnames and built-in attributes in general. And also that won't work
> if those built-in-attributes-made-slots are not declared with the right
> type in the setClass statement (i.e. "character" for names, "integer"
> for dim, etc...). And also by default names() would return character(0)
> and not NULL. So in the end, potentially a lot of complications /
> surprise / inconsistencies for very little value.
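
To make the inheritance objection quoted above concrete, here is a minimal sketch of the two-part rule as a predicate on a class definition. The helper name names_assign_policy() and the class names PlainIntVec and NamedIntVec are made up for illustration, and the sketch assumes that any ".Data" slot counts as a "vector data slot". It only reports what the proposed rule would decide; it does not change what names<-() actually does.

```r
library(methods)

# Hypothetical helper, not part of R: report what names<-() would do for a
# class under the proposed two-part rule. Simplifying assumption: any ".Data"
# slot counts as a "vector data slot".
names_assign_policy <- function(class_name) {
  slots <- names(getSlots(class_name))
  if (".Data" %in% slots && !("names" %in% slots)) "warn" else "error"
}

# Illustrative class names (assumptions, chosen so as not to clash with the
# classes A, B and I used elsewhere in this thread).
setClass("PlainIntVec", contains = "integer")
setClass("NamedIntVec", representation(names = "character"),
         contains = "PlainIntVec")

names_assign_policy("PlainIntVec")  # "warn": rule 1 applies
names_assign_policy("NamedIntVec")  # "error": the subclass falls under rule 2
```

Under this reading, adding a names slot in a subclass moves the class from rule 1 to rule 2, which is the inheritance break described above.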

But again (sorry to insist), what the final rules for assigning names
to S4 objects turn out to be matters less, IMO, than having consistent
rules. Consistent not only between names<-() and names(), but also
across "builtin" properties. For example (focusing here on S4 objects
with no .Data slot):

   > setClass("A", representation(aa="integer"))
   [1] "A"
   > a <- new("A")
   > names(a) <- "K"
   > dim(a) <- 3:2
   Error in dim(a) <- 3:2 : invalid first argument
   > levels(a) <- letters[1:3]

Why did dim<- return an error but not levels<- or names<-?

   > attributes(a)
   $aa
   integer(0)

   $class
   [1] "A"
   attr(,"package")
   [1] ".GlobalEnv"

   $names
   [1] "K"

   $levels
   [1] "a" "b" "c"

   > names(a)
   NULL

   > levels(a)
   [1] "a" "b" "c"

Why was levels() able to see its attribute but not names()?

   > attr(a, "names")
   [1] "K"

   > attr(a, "levels")
   [1] "a" "b" "c"

Why isn't attr(a, "names") equivalent to names(a)?

   > setClass("B", representation(bb="integer",
                                  names="character",
                                  dim="integer"))
   [1] "B"

   > b <- new("B")

   > attr(b, "bb") <- 11:16

   > b@bb
   [1] 11 12 13 14 15 16

   > attr(b, "names") <- LETTERS[1:6]

   > names(b)
   NULL

   > b@names
   [1] "A" "B" "C" "D" "E" "F"

   > b@dim <- 3:2

   > dim(b)
   [1] 3 2

   > attr(b, "dim")
   [1] 3 2

   > attr(b, "dim") <- 5:4
   Error in attr(b, "dim") <- 5:4 : invalid first argument

Why is the dim slot an exception to the "slots are treated
as attributes" rule?
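
The questions above all come down to whether an accessor and attr() agree on the same object. A minimal sketch of a helper that checks this for several built-in properties at once is given below; the function name accessor_matches_attr() and its default property list are assumptions made for illustration, not anything provided by R. What it reports for S4 objects such as a and b depends on the R version, so no output is claimed for them here.

```r
library(methods)

# Hypothetical helper, not part of R: for each built-in property, check whether
# the accessor function (names(), dim(), ...) and attr() return identical
# values for the object x.
accessor_matches_attr <- function(x,
                                  props = c("names", "dim", "dimnames", "levels")) {
  vapply(props, function(p) {
    identical(do.call(p, list(x)), attr(x, p, exact = TRUE))
  }, logical(1))
}

# An ordinary named vector is the consistent baseline: every entry is TRUE.
accessor_matches_attr(c(a = 1, b = 2))
```

Running the same helper on the objects a and b from the transcripts above would flag each property where the accessor and the attribute disagree.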

Thanks,
H.


>
> Thanks,
> H.
>
>>
>> John
>>
>>>
>>> H.
>>>
>>>
>>>> You could do
>>>> a@names if you thought that made sense:
>>>>
>>>>
>>>> > setClass("A", representation(names="character"))
>>>> [1] "A"
>>>> > a <- new("A")
>>>> > a@names <- "xx"
>>>> > a@names
>>>> [1] "xx"
>>>> > names(a)
>>>> NULL
>>>>
>>>>
>>>> If you wanted something sensible, it's more like:
>>>>
>>>> > setClass("B", representation(names = "character"), contains =
>>>> "integer")
>>>> [1] "B"
>>>> > b <- new("B", 1:5)
>>>> > names(b) <- letters[1:5]
>>>> > b
>>>> An object of class "B"
>>>> [1] 1 2 3 4 5
>>>> Slot "names":
>>>> [1] "a" "b" "c" "d" "e"
>>>>
>>>> > names(b)
>>>> [1] "a" "b" "c" "d" "e"
>>>>
>>>> This allows both the S4 and the primitive code to deal with a
>>>> well-defined object.
>>>>
>>>> John
>>>>
>>>>
>>>> On 5/15/11 3:02 PM, Hervé Pagès wrote:
>>>>> On 11-05-15 11:33 AM, John Chambers wrote:
>>>>>> This is basically a case of a user error that is not being caught:
>>>>>
>>>>> Sure!
>>>>>
>>>>> https://stat.ethz.ch/pipermail/r-devel/2009-March/052386.html
>>>> ......
>>>>
>>>>>
>>>>> Ah, that's interesting. I didn't know I could put a names slot in my
>>>>> class. Last time I tried was at least 3 years ago and that was causing
>>>>> problems (don't remember the exact details) so I ended up using NAMES
>>>>> instead. Trying again with R-2.14:
>>>>>
>>>>> > setClass("A", representation(names="character"))
>>>>>
>>>>> > a <- new("A")
>>>>>
>>>>> > attributes(a)
>>>>> $names
>>>>> character(0)
>>>>>
>>>>> $class
>>>>> [1] "A"
>>>>> attr(,"package")
>>>>> [1] ".GlobalEnv"
>>>>>
>>>>> > names(a)
>>>>> NULL
>>>>>
>>>>> > names(a) <- "K"
>>>>>
>>>>> > attributes(a)
>>>>> $names
>>>>> [1] "K"
>>>>>
>>>>> $class
>>>>> [1] "A"
>>>>> attr(,"package")
>>>>> [1] ".GlobalEnv"
>>>>>
>>>>> > names(a)
>>>>> NULL
>>>>>
>>>>> Surprise! But that's another story...
>>>>>
>>>>>>
>>>>>> The modification that would make sense would be to give you an
>>>>>> error in
>>>>>> the above code. Not a bad idea, but it's likely to generate more
>>>>>> complaints in other contexts, particularly where people don't
>>>>>> distinguish the "list" class from lists with names (the "namedList"
>>>>>> class).
>>>>>>
>>>>>> A plausible strategy:
>>>>>> 1. If the class has a vector data slot and no names slot, assign the
>>>>>> names but with a warning.
>>>>>>
>>>>>> 2. Otherwise, throw an error.
>>>>>>
>>>>>> (I.e., I would prefer an error throughout, but discretion ....)
>>>>>
>>>>> Or, at a minimum (if no consensus can be reached about the above
>>>>> strategy), not add a "names" attribute set to NULL. My original
>>>>> post was more about keeping the internal representation of objects
>>>>> "normalized", in general, so identical() is more likely to be
>>>>> meaningful.
>>>>>
>>>>> Thanks,
>>>>> H.
>>>>>
>>>>>>
>>>>>> Comments?
>>>>>>
>>>>>> John
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> H.
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-devel at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>>
>>>>>
>>>
>>>
>
>


-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
