[Rd] By default, `names<-` alters S4 objects

John Chambers jmc at r-project.org
Tue May 17 22:15:10 CEST 2011



On 5/17/11 9:53 AM, Hervé Pagès wrote:
> On 11-05-17 09:04 AM, John Chambers wrote:
>> One point that may have been unclear, though it's surprising if so. The
>> discussion was about assigning names to S4 objects from classes that do
>> NOT have a formal "names" slot. Of course, having a "names" slot is not
>> illegal, it's what one should do to deal with names in S4.
>
> IMO it looks more like what one should avoid doing right now because
> it's broken (as reported previously):
>
>  > setClass("A", representation(names="character"))
>  > a <- new("A")
>  > names(a) <- "K"
>  > names(a)
> NULL
>
> And on that particular issue here is what you said:
>
> You set up a names slot in a non-vector. Maybe that should be
> allowed, maybe not.
>
> And now:
>
> Of course, having a "names" slot is not illegal, it's what one
> should do to deal with names in S4.
>
> ??!

Good grief.  The classes like namedList _are_ vectors, that's the point.
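
To make the vector / non-vector distinction concrete, here is a minimal sketch. The class "PlainS4" and its "label" slot are hypothetical names introduced only for illustration; "namedList" is the class from the methods package quoted further down.

```r
library(methods)

# "namedList" comes with the methods package; its data part is a list.
xx <- new("namedList", list(a = 1, b = 2))
typeof(xx)          # basic type of the data part
is(xx, "vector")    # extends "vector" by way of "list"

# Hypothetical class with one ordinary slot and no data part.
setClass("PlainS4", representation(label = "character"))
p <- new("PlainS4", label = "x")
typeof(p)           # the non-vector type "S4"
is(p, "vector")
```

An object of the first kind is something the names() primitive can work with; an object of the second kind has no vector underneath for the names to belong to.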

Anyway, this is a waste of time.  I will add some code to r-devel that 
checks S4 objects when assigning names.  People can try it out on their 
examples.
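
For anyone who wants to experiment before that, the rule discussed in this thread can be approximated at the R level. This is only a rough sketch under stated assumptions, not the check that will go into r-devel: the helper name assignNamesChecked is hypothetical, the messages are made up, and treating every non-vector class as an error is one reading of a point the thread leaves open.

```r
library(methods)

# Hypothetical helper illustrating the proposed rule.
# It is NOT the code added to r-devel.
assignNamesChecked <- function(x, value) {
  if (!isS4(x)) {
    names(x) <- value                 # non-S4 objects: usual behaviour
    return(x)
  }
  cl <- class(x)
  isVector <- is(x, "vector")         # does the class have a vector data part?
  hasNamesSlot <- "names" %in% slotNames(cl)

  if (isVector && hasNamesSlot) {
    attr(x, "names") <- value         # formal names slot: assign silently
  } else if (isVector) {
    warning(sprintf("class \"%s\" has no formal \"names\" slot", cl))
    attr(x, "names") <- value         # case 1: assign, but warn
  } else {
    # case 2 (assumption: applies to every non-vector class)
    stop(sprintf("cannot assign names to non-vector class \"%s\"", cl))
  }
  x
}

# Vector data part, no names slot: names are assigned with a warning.
setClass("I", contains = "integer")
i <- assignNamesChecked(new("I", 1:4), LETTERS[1:4])
names(i)

# Formal names slot on a vector class: no warning.
xx <- assignNamesChecked(new("namedList", list(a = 1, b = 2)), c("D", "E"))
names(xx)
```

The sketch sets the attribute with attr<- rather than names<- so that only its own checks are exercised, whatever the primitive ends up doing.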

>
> H.
>
>
>> Look at class
>> "namedList" for example.
>>
>> Assigning names() to such a class would go through without warning as it
>> does now.
>>
>> > getClass("namedList")
>> Class "namedList" [package "methods"]
>>
>> Slots:
>>
>> Name: .Data names
>> Class: list character
>>
>> Extends:
>> Class "list", from data part
>> Class "vector", by class "list", distance 2
>>
>> Known Subclasses: "listOfMethods"
>> > xx <- new("namedList", list(a=1,b=2))
>> > names(xx)
>> [1] "a" "b"
>> > names(xx) <- c("D", "E")
>> > xx@names
>> [1] "D" "E"
>> >
>>
>> There was no question of breaking inheritance.
>>
>> On 5/16/11 4:13 PM, Hervé Pagès wrote:
>>> On 11-05-16 01:53 PM, John Chambers wrote:
>>>>
>>>>
>>>> On 5/16/11 10:09 AM, Hervé Pagès wrote:
>>>>> On 11-05-16 09:36 AM, John Chambers wrote:
>>>>>> You set up a names slot in a non-vector. Maybe that should be allowed,
>>>>>> maybe not. But in any case I would not expect the names() primitive to
>>>>>> find it, because your object has a non-vector type ("S4").
>>>>>
>>>>> But the names<-() primitive *does* find it. So either names() and
>>>>> names<-() should both find it, or they shouldn't. I mean, if you care
>>>>> about consistency and predictability of course.
>>>>
>>>> That's not the only case where borderline or mistaken behavior is caught
>>>> on assignment, but not on access. The argument is that assignment can
>>>> afford to check things, but access needs to be fast. Slot access is
>>>> another case. There, assignment ensures legality so access can be quick.
>>>>
>>>> The catch is that there are sometimes backdoor ways to assignments,
>>>> partly because slots, attributes and some "builtin" properties like
>>>> names overlap.
>>>>
>>>> What we were talking about before was trying to evolve a sensible rule
>>>> for assigning names to S4 objects. Let's try to discuss what people need
>>>> to do before carping or indulging in sarcasm.
>>>
>>> What *you* were talking about but not what my original post was about.
>>> Anyway, about the following proposal:
>>>
>>> 1. If the class has a vector data slot and no names slot, assign the
>>> names but with a warning.
>>>
>>> 2. Otherwise, throw an error.
>>>
>>> (I.e., I would prefer an error throughout, but discretion ....)
>>>
>>> I personally don't like it because it breaks inheritance. Let's
>>> say I have a class B with a vector data slot and no names slot.
>>> According to 1. names<-() would work out-of-the-box on it (with
>>> a warning), but now if I extend it by adding a names slot, it
>>> breaks.
>>>
>>> One thing to consider though is that this works right now (and with
>>> no warning):
>>>
>>> > setClass("I", contains="integer")
>>> [1] "I"
>>> > i <- new("I", 1:4)
>>> > names(i) <- LETTERS[1:4]
>>> > attributes(i)
>>> $class
>>> [1] "I"
>>> attr(,"package")
>>> [1] ".GlobalEnv"
>>>
>>> $names
>>> [1] "A" "B" "C" "D"
>>>
>>> > names(i)
>>> [1] "A" "B" "C" "D"
>>>
>>> and it's probably what most people would expect (sounds reasonable
>>> after all). So this needs to keep working (with no warning). I can
>>> see 2 ways to avoid breaking inheritance:
>>>
>>> (a) not allow a names slot to be added to class I or any
>>> of its subclasses (in other words the .Data and names
>>> slots cannot coexist),
>>> or
>>> (b) have names() and names<-() keep working when the names slot is
>>> added but that is maybe dangerous as it might break C code that
>>> is trying to access the names, that is, inheritance might break
>>> but now at the C level
>>>
>>> Now for classes that don't have a .Data slot, they can of course
>>> have a names slot. I don't have a strong opinion on whether names()
>>> and names<-() should access it by default, but honestly that's really
>>> a very small convenience offered to the developer of the class. Also,
>>> for the sake of consistency, the same would need to be done for dim,
>>> dimnames and built-in attributes in general. And also that won't work
>>> if those built-in-attributes-made-slots are not declared with the right
>>> type in the setClass statement (i.e. "character" for names, "integer"
>>> for dim, etc...). And also by default names() would return character(0)
>>> and not NULL. So in the end, potentially a lot of complications /
>>> surprise / inconsistencies for very little value.
>>>
>>> Thanks,
>>> H.
>>>
>>>>
>>>> John
>>>>
>>>>>
>>>>> H.
>>>>>
>>>>>
>>>>>> You could do a@names if you thought that made sense:
>>>>>>
>>>>>>
>>>>>> > setClass("A", representation(names="character"))
>>>>>> [1] "A"
>>>>>> > a <- new("A")
>>>>>> > a@names <- "xx"
>>>>>> > a@names
>>>>>> [1] "xx"
>>>>>> > names(a)
>>>>>> NULL
>>>>>>
>>>>>>
>>>>>> If you wanted something sensible, it's more like:
>>>>>>
>>>>>> > setClass("B", representation(names = "character"), contains = "integer")
>>>>>> [1] "B"
>>>>>> > b <- new("B", 1:5)
>>>>>> > names(b) <- letters[1:5]
>>>>>> > b
>>>>>> An object of class "B"
>>>>>> [1] 1 2 3 4 5
>>>>>> Slot "names":
>>>>>> [1] "a" "b" "c" "d" "e"
>>>>>>
>>>>>> > names(b)
>>>>>> [1] "a" "b" "c" "d" "e"
>>>>>>
>>>>>> This allows both the S4 and the primitive code to deal with a
>>>>>> well-defined object.
>>>>>>
>>>>>> John
>>>>>>
>>>>>>
>>>>>> On 5/15/11 3:02 PM, Hervé Pagès wrote:
>>>>>>> On 11-05-15 11:33 AM, John Chambers wrote:
>>>>>>>> This is basically a case of a user error that is not being caught:
>>>>>>>
>>>>>>> Sure!
>>>>>>>
>>>>>>> https://stat.ethz.ch/pipermail/r-devel/2009-March/052386.html
>>>>>> ......
>>>>>>
>>>>>>>
>>>>>>> Ah, that's interesting. I didn't know I could put a names slot in my
>>>>>>> class. Last time I tried was at least 3 years ago and that was causing
>>>>>>> problems (don't remember the exact details) so I ended up using NAMES
>>>>>>> instead. Trying again with R-2.14:
>>>>>>>
>>>>>>> > setClass("A", representation(names="character"))
>>>>>>>
>>>>>>> > a <- new("A")
>>>>>>>
>>>>>>> > attributes(a)
>>>>>>> $names
>>>>>>> character(0)
>>>>>>>
>>>>>>> $class
>>>>>>> [1] "A"
>>>>>>> attr(,"package")
>>>>>>> [1] ".GlobalEnv"
>>>>>>>
>>>>>>> > names(a)
>>>>>>> NULL
>>>>>>>
>>>>>>> > names(a) <- "K"
>>>>>>>
>>>>>>> > attributes(a)
>>>>>>> $names
>>>>>>> [1] "K"
>>>>>>>
>>>>>>> $class
>>>>>>> [1] "A"
>>>>>>> attr(,"package")
>>>>>>> [1] ".GlobalEnv"
>>>>>>>
>>>>>>> > names(a)
>>>>>>> NULL
>>>>>>>
>>>>>>> Surprise! But that's another story...
>>>>>>>
>>>>>>>>
>>>>>>>> The modification that would make sense would be to give you an error in
>>>>>>>> the above code. Not a bad idea, but it's likely to generate more
>>>>>>>> complaints in other contexts, particularly where people don't
>>>>>>>> distinguish the "list" class from lists with names (the "namedList"
>>>>>>>> class).
>>>>>>>>
>>>>>>>> A plausible strategy:
>>>>>>>> 1. If the class has a vector data slot and no names slot, assign the
>>>>>>>> names but with a warning.
>>>>>>>>
>>>>>>>> 2. Otherwise, throw an error.
>>>>>>>>
>>>>>>>> (I.e., I would prefer an error throughout, but discretion ....)
>>>>>>>
>>>>>>> Or, at a minimum (if no consensus can be reached about the above
>>>>>>> strategy), not add a "names" attribute set to NULL. My original
>>>>>>> post was more about keeping the internal representation of objects
>>>>>>> "normalized", in general, so identical() is more likely to be
>>>>>>> meaningful.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> H.
>>>>>>>
>>>>>>>>
>>>>>>>> Comments?
>>>>>>>>
>>>>>>>> John
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> H.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> ______________________________________________
>>>>>>>> R-devel at r-project.org mailing list
>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>>>
>
>


