[Rd] Best style to organize code, namespaces

Gabor Grothendieck ggrothendieck at gmail.com
Wed Feb 24 04:07:16 CET 2010


That's quite nice.  I see you did post about it last September when it
was added to the devel version of R:
http://www.mail-archive.com/r-devel@r-project.org/msg17708.html

Relative to this discussion what would be useful would be a facility
that debugs not only a function but also all embedded functions in it
so that, say

debug(outer, recursive = TRUE)

would turn on debug for both outer and inner.  In fact, recursive
might be the default in which case one would specify recursive = FALSE
to only debug the outer function.

Also it would be useful to extend that to environments so that
debug(e) would debug all functions directly in environment e.


On Tue, Feb 23, 2010 at 5:42 PM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
> On 22/02/2010 10:15 PM, Gabor Grothendieck wrote:
>>
>> As you mention ease of debugging basically precludes subfunctions so
>> style 1 is left.
>
> I think you are also unaware of setBreakpoint() (and trace(), which
> underlies it).  So more detail:
>
> If you put this into file test.R, starting on line 1:
>
> outer <- function(x) {
>   inner <- function() {
>       if (x > 3) {
>          x <- 0
>       }
>   }
>
>   inner()
>   print(x)
> }
>
> you might want to debug why outer(10) prints 10, not 0.  So set a breakpoint
> on line 4 to see if you get there:
>
>> setBreakpoint("test.R#4")
> c:\temp\test.R#4:
>  outer step 2,3,3,2,3,2 in <environment: R_GlobalEnv>
>> outer(10)
> test.R#4
> Called from: eval(expr, envir, enclos)
> Browse[1]>
>
> Yes, we got there.  Take a step:
>
> Browse[1]> n
> debug: x <- 0
>
> Now recognize we should have used x <<- 0.
>
> Now, it would be nice if we had more flexible debugging (e.g. single
> stepping that would stay as single stepping when we exited from inner()),
> but debug() is certainly not the only possibility for debugging.  It's not
> even the best choice in a lot of situations where it does work.
>
> Duncan Murdoch
>
>>
>> Functions can be nested in environments rather than in other functions
>> and this will allow debugging to still occur.
>>
>> The proto package which makes it particularly convenient to nest
>> functions in environments giving an analog to #3 while still allowing
>> debugging.  See http//:r-proto.googlecode.com
>>
>>> library(proto)
>>> # p is proto object with variable a and method f
>>> p <- proto(a = 1, f = function(., x = 1) .$a <- .$a + 1)
>>> with(p, debug(f))
>>> p$f()
>>
>> debugging in: get("f", env = p, inherits = TRUE)(p, ...)
>> debug: .$a <- .$a + 1
>> Browse[2]>
>> exiting from: get("f", env = p, inherits = TRUE)(p, ...)
>> [1] 2
>>>
>>> p$a
>>
>> [1] 2
>>
>>
>> On Mon, Feb 22, 2010 at 9:49 PM, Ben <misc7 at emerose.org> wrote:
>>>
>>> Hi all,
>>>
>>> I'm hoping someone could tell me what best practices are as far as
>>> keeping programs organized in R.  In most languages, I like to keep
>>> things organized by writing small functions.  So, suppose I want to
>>> write a function that would require helper functions or would just be
>>> too big to write in one piece.  Below are three ways to do this:
>>>
>>>
>>> ################### Style 1 (C-style) ###############
>>> Foo <- function(x) {
>>>  ....
>>> }
>>> Foo.subf <- function(x, blah) {
>>>  ....
>>> }
>>> Foo.subg <- function(x, bar) {
>>>  ....
>>> }
>>>
>>> ################### Style 2 (Lispish?) ##############
>>> Foo <- function(x) {
>>>  Subf <- function(blah) {
>>>   ....
>>>  }
>>>  Subg <- function(bar) {
>>>   ....
>>>  }
>>>  ....
>>> }
>>>
>>> ################### Object-Oriented #################
>>> Foo <- function(x) {
>>>  Subf <- function(blah) {
>>>   ....
>>>  }
>>>  Subg <- function(bar) {
>>>   ....
>>>  }
>>>  Main <- function() {
>>>   ....
>>>  }
>>>  return(list(subf=subf, subg=subg, foo=foo))
>>> }
>>> ################### End examples ####################
>>>
>>> Which of these ways is best?  Style 2 seems at first to be the most
>>> natural in R, but I found there are some major drawbacks.  First, it
>>> is hard to debug.  For instance, if I want to debug Subf, I need to
>>> first "debug(Foo)" and then while Foo is debugging, type
>>> "debug(Subf)".  Another big limitation is that I can't write
>>> test-cases (e.g. using RUnit) for Subf and Subg because they aren't
>>> visible in any way at the global level.
>>>
>>> For these reasons, style 1 seems to be better than style 2, if less
>>> elegant.  However, style 1 can get awkward because any parameters
>>> passed to the main function are not visible to the others.  In the
>>> above case, the value of "x" must be passed to Foo.subf and Foo.subg
>>> explicitly.  Also there is no enforcement of code isolation
>>> (i.e. anyone can call Foo.subf).
>>>
>>> Style 3 is more explicitly object oriented.  It has the advantage of
>>> style 2 in that you don't need to pass x around, and the advantage of
>>> style 1 in that you can still write tests and easily debug the
>>> subfunctions.  However to actually call the main function you have to
>>> type "Foo(x)$Main()" instead of "Foo(x)", or else write a wrapper
>>> function for this.  Either way there is more typing.
>>>
>>> So anyway, what is the best way to handle this?  R does not seem to
>>> have a good way of managing namespaces or avoiding collisions, like a
>>> module system or explicit object-orientation.  How should we get
>>> around this limitation?  I've looked at sample R code in the
>>> distribution and elsewhere, but so far it's been pretty
>>> disappointing---most people seem to write very long, hard to
>>> understand functions.
>>>
>>> Thanks for any advice!
>>>
>>> --
>>> Ben
>>>
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>



More information about the R-devel mailing list