[Rd] common base functions stripping S3 class

Tue Nov 18 14:15:32 CET 2014

On 17/11/2014, 4:23 PM, Murat Tasan wrote:
> Yeah, my biggest stumbling-point while starting to write S3 classes
> was the dichotomy between default methods that preserve the class and
> default methods that don't.
> But I'm not sure it's so "easy" to figure this out without more
> documentation... (though my experience is n = 1, and I might be
> particularly slow).

What I meant is that you can just try it.  If you think your users will
want to subset your object, then you can try it yourself, and you'll see
that you need to write a `[` method.
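
A minimal sketch of "just trying it", reusing the MyClass constructor
quoted below.  The `[` method shown is one common idiom (save the class,
call NextMethod(), restore the class), not the only way to write it:

```r
MyClass <- function(x) structure(x, class = "MyClass")
obj <- MyClass(c(3, 1, 2))

# With no `[` method defined, the default drops the class.
before <- class(obj[1:2])    # "numeric"

# A `[` method that delegates to the default and restores the class.
`[.MyClass` <- function(x, ...) {
  cl <- oldClass(x)
  val <- NextMethod("[")
  class(val) <- cl
  val
}

after <- class(obj[1:2])     # "MyClass"
```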

Duncan Murdoch

> 
> The most common motivating example for S3 classes (I've seen) is
> overriding plot().
> I imagine many people would want to take a base structure (e.g. a
> simple vector) and 'class-ify' it solely for the purposes of
> encapsulating domain-specific plotting commands:
> 
> MyClass <- function(x) structure(x, class = "MyClass")
> plot.MyClass <- function(x, ...) {
>   ## large complicated plotting function here.
> }
> 
> Those examples, however, basically never mention the need to then
> override/implement many other common methods, `c`, `[`, `unique`,
> `as.list`, `as.data.frame`, etc.
> I believe this is a _huge_ tripping point for newcomers to R
> programming (even if they are not new to programming more
> generally).
> In my own experience, I had to work backwards by finding methods that
> dropped my class, then examine the source for those methods, find the
> underlying calls in those methods that dropped the class, and continue
> on down the (rabbit hole) call stack... this is hardly ideal for any
> programmer, I think, experienced or novice.
> 
> In the end, I completely understand your point (e.g. with the sorted
> numbers example), and I don't know how to resolve the issue, save
> perhaps for more explicit warnings when introducing S3 programming?
> 
> My own solution, by the way, is to define a single ancestor class whose
> methods either (i) error immediately if some assumption fails, or (ii)
> dispatch to the default method and then restore the class
> attributes of the returned object.
> Most of my 'useful' classes inherit from this 'dummy' ancestor class,
> just to save a lot of re-writing dispatch code.
> An example of where I error out immediately is something like `c`,
> where I'll check to make sure all args are of the same class...
> if they aren't, I could use R's coercion rules, but I've opted for the
> 'type-safe' approach of refusing to mix variables when dealing with my
> own custom classes.
> An example of where I opt for preserving class is `[`.
> If I write a class where subsetting doesn't make sense, I'll have to
> write a fail-fast implementation of `[` for that specific class.
> The whole thing seems... inelegant (for lack of a better word), which
> is what prompted my post in the first place.
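
The ancestor-class approach described above might look roughly like the
following.  This is only a sketch: the class names "Base" and
"Temperature" are hypothetical, and the real implementation being
described may differ in its checks and in which methods it covers.

```r
# Hypothetical ancestor class "Base": subclasses list it last in their
# class vector so they inherit these methods.

# (ii) Preserve the class: delegate to the default `[`, then restore it.
`[.Base` <- function(x, ...) {
  cl <- oldClass(x)
  val <- NextMethod("[")
  class(val) <- cl
  val
}

# (i) Fail fast: refuse to combine objects whose classes differ.
c.Base <- function(...) {
  args <- list(...)
  cl <- oldClass(args[[1L]])
  same <- vapply(args, function(a) identical(oldClass(a), cl), logical(1))
  if (!all(same))
    stop("all arguments to c() must have class ", paste(cl, collapse = "/"))
  structure(do.call(c, lapply(args, unclass)), class = cl)
}

# A hypothetical 'useful' class that inherits from the ancestor.
Temperature <- function(x) structure(x, class = c("Temperature", "Base"))

t1 <- Temperature(c(20, 25))
t2 <- Temperature(30)
combined <- c(t1, t2)                                  # keeps the class
mixed <- tryCatch(c(t1, 5), error = function(e) e)     # an error condition
```

A class for which subsetting makes no sense would still need its own
`[` method that stops with an error.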
> 
> Cheers, and thanks for the discussion and points... they're definitely
> helpful in guiding development.
> 
> -murat
> 
> 
> On Mon, Nov 17, 2014 at 9:19 AM, Duncan Murdoch
> <murdoch.duncan at gmail.com> wrote:
>> On 17/11/2014 10:41 AM, Hadley Wickham wrote:
>>>
>>>> Generally the idea is that the class should be stripped because R has no
>>>> way of knowing if the new object, for example unique(obj), still has the
>>>> necessary properties to be considered to be of the same class as obj.
>>>> Only the author of the class knows that.  S4 would help a bit here, but
>>>> only structurally (it could detect when the object couldn't possibly be
>>>> of the right class), not semantically.
>>>
>>> There are two possible ways that S3 methods could handle subclasses:
>>>
>>> * preserve by default (would also have to preserve all attributes)
>>> * drop by default
>>>
>>> If you could rely on either system consistently, I think you could
>>> write correct code. It's very hard when the defaults vary.
>>>
>>> (In other words, I agree with everything you said, except I think if
>>> the default was to preserve you could still write correct code)
>>
>>
>> I don't see how default preserving could work.
>>
>> For example, I might define a "SortedNumbers" class, which is a vector of
>> numbers in non-decreasing order.  I could define min() and max() methods for
>> it which would be really fast, because they only need to look at the first
>> or last elements.  But a rev() method wouldn't make sense, so I wouldn't
>> define one of those.
>>
>> If the rev() default method left the class as "SortedNumbers", then my min()
>> and max() calculations would end up broken.
>> So maybe I should have defined a rev() method that just stops with an error.
>> But classes don't own methods, so I'd have no way of knowing that someone
>> else defined a new generic (e.g. shuffle()) that broke things.  I don't see
>> any way around this within the S3 system.
>>
>> In fact, some default methods do preserve the class, for example the
>> replacement method `[<-`.  I could take a SortedNumbers vector of the
>> numbers 1:10, and set element 1 to 11, and end up breaking min() and max().
>> This is a problem with the current design.
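
The `[<-` problem can be reproduced in a few lines.  The constructor and
the min()/max() methods here are a hypothetical implementation of the
SortedNumbers class described above, written only for illustration:

```r
# Hypothetical constructor: sorts once, then trusts the ordering.
SortedNumbers <- function(x) structure(sort(x), class = "SortedNumbers")

# Fast methods that only look at the first or last element.
min.SortedNumbers <- function(x, ..., na.rm = FALSE) unclass(x)[1L]
max.SortedNumbers <- function(x, ..., na.rm = FALSE) {
  v <- unclass(x)
  v[length(v)]
}

x <- SortedNumbers(1:10)
x[1] <- 11            # default `[<-` keeps the class attribute

class(x)              # still "SortedNumbers"
min(x)                # 11, although the true minimum is 2
max(x)                # 10, although the true maximum is 11
min(unclass(x))       # 2, from the default method
```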
>>
>> Probably we should do a better job of documenting which methods preserve the
>> class and which ones don't.  (For example, `[` doesn't preserve the class,
>> even though it would be fine to do so in this example.)  But there are a lot
>> of things to do, and this is one thing that is pretty easy to figure out
>> without documentation, so I'd say it's a low priority.
>>
>> Duncan Murdoch
>>
>>