[Rd] Wish there were a "strict mode" for R interpreter. What

(Ted Harding) ted.harding at wlandres.net
Sat Apr 9 23:08:13 CEST 2011


On 09-Apr-11 20:37:28, Duncan Murdoch wrote:
> On 11-04-09 3:51 PM, Paul Johnson wrote:
>> Years ago, I did lots of Perl programming. Perl will let you be lazy
>> and write functions that refer to undefined variables (like R does),
>> but there is also a strict mode so the interpreter will block anything
>> when a variable is mentioned that has not been defined. I wish there
>> were a strict mode for checking R functions.
>>
>> Here's why. We have a lot of students writing R functions around here
>> and they run into trouble because they use the same name for things
>> inside and outside of functions. When they call functions that have
>> mistaken or undefined references to names that they use elsewhere,
>> then variables that are in the environment are accidentally used. Know
>> what I mean?
>>
>> dat<- whatever
>>
>> someNewFunction<- function(z, w){
>>     #do something with z and w and create a new "dat"
>>     # but forget to name it "dat"
>>      lm (y, x, data=dat)
>>     # lm just used wrong data
>> }
>>
>> I wish R had a strict mode to return an error in that case. Users
>> don't realize they are getting nonsense because R finds things to fill
>> in for their mistakes.
>>
>> Is this possible?  Does anybody agree it would be good?
>>
> 
> It would be really bad, unless done carefully.
> 
> In your function the free (undefined) variables are dat and lm.  You 
> want to be warned about dat, but you don't want to be warned about lm. 
> What rule should R use to determine that?
> 
> (One possible rule would work in a package with a namespace.  In that 
> case, all variables must be found in declared dependencies, the search 
> could stop before it got to globalenv().  But it seems unlikely that 
> your students are writing packages with namespaces.)
> 
> Duncan Murdoch

I'm with Duncan on this one! On the other hand, I can understand the
issues that Paul's students might encounter.

I think the right thing to do is to introduce the students to the
basics of scoping, early in the process of learning R.

Thus, when there is a variable (such as 'lm' in the example) which
you *expect* to already be out there (since 'lm' is in 'stats'
which is pre-loaded by default), then you can go ahead and use it.
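
For anyone who wants to see where such a name is actually being found,
a short sketch along these lines may help (it only assumes a default R
session in which 'stats' and 'utils' are attached):

```r
# 'lm' is not defined in the workspace; R finds it further along the search path.
where_attached <- find("lm")                      # attached package that provides 'lm'
home_namespace <- environmentName(environment(lm)) # namespace the function lives in

# TRUE only if someone has created their own object called 'lm' in the workspace.
in_workspace <- exists("lm", envir = globalenv(), inherits = FALSE)

print(where_attached)
print(home_namespace)
print(in_workspace)
```

Running the same three checks on 'dat' shows the difference: it exists
only if it happens to have been created in the workspace.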

But when your function uses a variable (e.g. 'dat') which just
*happened* to be out there when you first wrote the function,
then when you re-use the same function definition in a different
context things are likely to go wrong. So teach them that any
variable used in a function, whose meaning could depend on the
context of use, should either be a named argument in the argument
list or be explicitly defined within the function. It should not
be assumed to already exist unless that is guaranteed in every
context in which the function would be used.
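
A minimal sketch of the difference, with made-up data and hypothetical
function names (fitLeaky, fitStrict); none of this is from Paul's code:

```r
# Made-up data, for illustration only.
dat   <- data.frame(x = 1:10, y = 2 * (1:10) + 1)
other <- data.frame(x = 1:10, y = 5 * (1:10))

# Leaky: 'dat' is a free variable, so R silently picks up the one in the workspace.
fitLeaky <- function() {
  lm(y ~ x, data = dat)
}

# Safer: the data is a named argument, so the caller has to say which data to use.
fitStrict <- function(d) {
  lm(y ~ x, data = d)
}

coef(fitLeaky())        # fitted to the workspace 'dat', whatever it happens to be
coef(fitStrict(other))  # fitted to exactly the data frame that was passed in
```

The first version gives an answer whenever some 'dat' exists, right or
wrong; the second cannot use the wrong data without the caller asking
for it.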

This is basic good practice which, once routinely adopted, should
ensure that the right thing is done every time!
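
Short of a strict mode, a function can at least be inspected for free
variables. The sketch below assumes the 'codetools' package (one of
the recommended packages) is installed, and uses its findGlobals()
function on a cut-down version of Paul's example. As Duncan points
out, it reports 'lm' and 'dat' alike, so someone still has to decide
which names were intended:

```r
library(codetools)

# Cut-down version of the example function; 'dat' is never defined inside it.
someNewFunction <- function(z, w) {
  lm(y ~ x, data = dat)
}

# merge = FALSE separates free names used as functions from those used as variables.
g <- findGlobals(someNewFunction, merge = FALSE)
print(g$functions)  # includes "lm"
print(g$variables)  # includes "dat"
```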

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at wlandres.net>
Fax-to-email: +44 (0)870 094 0861
Date: 09-Apr-11                                       Time: 22:08:10
------------------------------ XFMail ------------------------------


