[R] R design (was "Variable passed to function not used in function in select")

Duncan Murdoch murdoch at stats.uwo.ca
Tue Nov 11 17:12:03 CET 2008


On 11/11/2008 10:28 AM, Terry Therneau wrote:
>  I've read the back and forth this morning, and I have to side with Vince.
>  
>  1. Functions that re-interpret their arguments are very dangerous.  The 
> original question involved a well formed call to a function, which returned the 
> wrong answer.  Bug, design flaw, whatever -- it's a mistake and the best choice 
> would be to fix it.
>   I only consider such behavior in 2 cases:
>   	a. when the function is almost never, ever, called from anything but the 
> top level.  help() is the only example I can think of.
>   	b. to create a label from an argument, as in plot, but the argument 
> itself is left alone to work as it should.

There's another major use for this:  model formulas.  I like to be able 
to write lm(y ~ ., data=df), and I'd really hate to have to evaluate all 
the terms in a model formula explicitly.
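
A minimal sketch of what that convenience looks like in practice. The data frame and its column names are made up for illustration. The formula is stored as an unevaluated object, and lm() looks the terms up in `data` only when it fits the model:

```r
# Hypothetical data frame; the column names are made up for illustration.
df <- data.frame(y  = c(1.1, 1.9, 3.2, 3.9, 5.1),
                 x1 = c(1, 2, 3, 4, 5),
                 x2 = c(0, 1, 0, 1, 0))

f <- y ~ .        # nothing is evaluated here; y need not exist as a variable
class(f)          # "formula"
all.vars(f)       # "y" "."

# lm() expands the dot to "every other column of data", so the terms never
# have to be written out or evaluated by the caller.
fit <- lm(y ~ ., data = df)
names(coef(fit))  # "(Intercept)" "x1" "x2"
```

If arguments were always evaluated before being passed, `y ~ .` would have to fail, since neither `y` nor `.` exists as a variable in the calling environment.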

> One possible fix for subset: first treat the argument formally, and only if that 
> simple interpretation fails try the more 'clever' interpretations.  Whether this 
> is doable or not I can't say.  	


>   2. The documentation of subset is not in any way clear.  I would never have 
> been able to diagnose or work around this bug.  The issues are very subtle.  
>   I quite often see "it's in the manual so we bear no blame" as an argument on 
> this list.  We all need to remember that our view of what we are particularly 
> close to is a distorted one -- I for instance think that everything about the 
> survival package is crystal clear --- and be particularly open to concerns that 
> something is opaque or subtle.
>   
>   3. I've heavily used perhaps 20 computing languages in my life.  I found S to 
> be a refreshing revelation (referring to S of the 1988 Blue manual) precisely 
> because it was completely functional.  Once I got used to it, this feature made 
> it so much more useful, extensible, understandable than other things I'd used.

I don't know your definition of "completely functional", but I don't 
think S and R have ever been.  It has always been possible to refer to 
non-local variables within a function (and their meaning is different 
between S and R, but I think R tends to be a bit more functional in 
this), to make super-assignments, to do lots of things that have side 
effects.
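
A small sketch of both points in R, with hypothetical names (`rate`, `calls`, `discount`). The function reads a variable it does not own and modifies another one through super-assignment, so its result depends on more than its arguments:

```r
rate  <- 0.5                 # non-local variable, defined outside the function
calls <- 0                   # will be modified from inside the function

discount <- function(x) {
  calls <<- calls + 1        # super-assignment: a side effect outside the function
  x * rate                   # `rate` is found in the enclosing environment
}

discount(10)   # 5
rate <- 0.25   # changing the non-local variable changes the result
discount(10)   # 2.5
calls          # 2
```

The same call, `discount(10)`, returns two different values, which is exactly what a purely functional language would rule out.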


>    R is becoming less and less a functional language (hidden functions and 
> dependencies with environments for one), I quite often cannot figure out either 
> exactly what a function calls or how to get it to stop doing it.  

I don't know what you mean here.  Are you talking about recent changes?
(Which ones?)  Or are you talking about older things, like namespaces?
Or closures, which have been in R from the beginning (and which are
part of why I'd call it more functional than S)?
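
For readers who have not met closures, a minimal sketch (the names `make_counter` and `counter` are hypothetical). A function created inside another function keeps access to the environment in which it was created:

```r
make_counter <- function() {
  count <- 0                 # lives in the environment of this call
  function() {
    count <<- count + 1      # updates the captured variable, not a global
    count
  }
}

counter_a <- make_counter()
counter_b <- make_counter()  # a second, independent environment

counter_a()   # 1
counter_a()   # 2
counter_b()   # 1
```

Each call to `make_counter()` creates its own `count`, so the two counters do not interfere with each other.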


> I am not sure
> we have gained with each choice of "convenience" or sophistication over 
> functional purity.  I want "scan(file=myfile)" to continue to return "variable 
> myfile not found" when I forget the quotes.

R allows a lot of flexibility in how arguments are handled, and there's 
been some experimentation with different variations.  Remember that R is 
partly a laboratory in which people are trying to invent new ways of 
doing statistical computing, and also remember that R (including its 
contributed packages) has hundreds of authors, not all of whom agree on 
the best way to do things.  The benefit of this is that more stuff gets 
done:  I'm not forced to adopt your ideas of The Right Way to Do Things, 
so I can get down to coding in the way I like. The disadvantage is that 
things can be inconsistent, so people are forced to read the 
documentation, and the documentation is always imperfect.
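
To make the trade-off concrete, here is a sketch of the two styles of argument handling and of the kind of name collision that can follow. This is an illustration under assumptions, not the original poster's code; `label_of`, `keep_above`, and the data frame are hypothetical:

```r
df <- data.frame(x = 1:5, y = c(10, 20, 30, 40, 50))

# Non-standard evaluation used only for a label: the expression is captured
# as text and never evaluated, so `height` does not need to exist.
label_of <- function(arg) deparse(substitute(arg))
label_of(height + 1)          # "height + 1"

# subset() evaluates its condition inside the data frame first.  Here the
# argument `x` has the same name as a column, so the column wins.
keep_above <- function(data, x) {
  subset(data, y > x)         # `x` resolves to data$x, not to the argument
}
nrow(keep_above(df, 35))      # 5: every row satisfies y > data$x

# Standard evaluation with explicit indexing does what the caller meant.
nrow(df[df$y > 35, ])         # 2
```

The call to `keep_above()` is well formed and runs without any warning, which is why this kind of problem is hard to diagnose from the documentation alone.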

>      
>    I am stumped by the R results I get too often, and I'm not a novice.  That 
> said, good design is hard.  I spend a lot of time on that aspect in the survival 
> package and there are still bits where the 'right' way is only clear after 
> several years experience.  I do occasionally make non-backwards compatible 
> changes.  The R core team has done an amazing job on the whole.

If I'm not mistaken, you are still an S user as well as an R user, and
this is a bit of a disadvantage:  at a fundamental level, they are
different languages, though they look superficially similar.  I haven't 
used S in quite a few years, so I expect I'd be stumped by the results I 
got there in a lot of cases.  I think that in the main R is a simpler, 
easier language to understand, but there are certainly bits and pieces 
of it where it is not easy.

>    And let's not shoot the bearers of bad news.

I think we can discuss what's good and what's bad about the language 
without bringing out the guns or insults.

Duncan Murdoch


