[Rd] Please explain your workflow from R code -> package -> R code -> package

Fri Sep 9 20:10:07 CEST 2011

On 9/9/2011 10:47 AM, Duncan Murdoch wrote:
> On 09/09/2011 12:38 PM, Paul Johnson wrote:
>> Hi,
>>
>> I'm asking another one of those questions that would be obvious if I
>> could watch your work while you do it.
>>
>> I'm having trouble understanding the workflow of code and package 
>> maintenance.
>>
>> Stage 1.  Make some R functions in a folder.  This is in a Subversion 
>> repo
>>
>> R/trunk/myproject
>>
>> Stage 2. Make a package:
>>
>> After the package.skeleton, and R check, I have a new folder with the
>> project in it,
>>
>> R/trunk/myproject/mypackage
>>    DESCRIPTION
>>    man
>>    R
>>
>> I go into the man folder and manually edit the Rd files. I don't
>> change anything in the R folder because I think it is OK so far.
>>
>> And eventually I end up with a tarball mypackage_1.0.tar.gz.
>>
>> Stage 3. How to make the round trip? I add more R code, and
>> re-generate a package.
>>
>> package.skeleton obliterates the help files I've already edited.
>
> You should only run it once.  After that, add your code by editing *.R 
> files in the R directory, sourcing them, and generate *.Rd files using 
> prompt().  As Dirk said, run R CMD check when you think you're done, 
> and it will point out how wrong you are.
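
A minimal sketch of the cycle described above, assuming a throwaway directory and two hypothetical functions (add_one and times_two, neither is from the original question): package.skeleton() is run once, and later additions get their help files from prompt(), so existing Rd files are never regenerated.

```r
# Work in a fresh temporary directory so no real project is touched.
pkg_parent <- tempfile("workflow-sketch")
dir.create(pkg_parent)

# Stage 2: run package.skeleton() exactly once to create the layout.
env <- new.env()
env$add_one <- function(x) x + 1              # hypothetical first function
package.skeleton(name = "mypackage", list = "add_one",
                 environment = env, path = pkg_parent)
pkg_dir <- file.path(pkg_parent, "mypackage")

# Stage 3: add new code as a new file under R/ and source it ...
new_file <- file.path(pkg_dir, "R", "times_two.R")
writeLines("times_two <- function(x) 2 * x", new_file)  # hypothetical addition
source(new_file)

# ... then create only the new help file with prompt().
# man/add_one.Rd (and any hand edits made to it) is left alone.
prompt(times_two, name = "times_two",
       filename = file.path(pkg_dir, "man", "times_two.Rd"))

list.files(file.path(pkg_dir, "man"))
```

The generated times_two.Rd is only a template; it still has to be filled in by hand before R CMD check will pass cleanly.
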
>> So keeping the R code in sync with the documentation appears to be a 
>> hassle.
>
> If you write the *.Rd file before (like Spencer) or soon after writing 
> the code, then design errors will usually stick out at you, and you 
> can modify the functions.  If you keep your functions small, you'll 
> get them working early, and won't have a lot of problems keeping them 
> in sync with the docs, because they won't change much once you get 
> them right.

For me, the benefits are huge:  I believe I tripled my software 
development productivity almost overnight when I started writing 
documentation with examples (unit tests) before writing the code.  Then 
I run "R CMD check" after every tiny change.  This may seem like extra 
work, but it saves debugging time, because any new problems are likely 
restricted to what I changed.  For example, I write a function A.  Then 
I write B.  Then I write C.  In the process of writing C, I change A.  R 
CMD check after adding C reveals that the change to A broke B.  Without 
the R package discipline, it could easily be a year before I found that 
a bug existed, and then it would be an enormous effort to find and fix it.  
(See Wikipedia, "Software repository", "Package development process".)  
In addition to having better code in less time for myself, I can easily 
share the results with others -- thereby increasing my productivity 
substantially more than the factor of three I mentioned.
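
The A, B, C scenario above can be sketched in plain R.  This is only an illustration under stated assumptions: fun_a, fun_b and fun_c are hypothetical stand-ins for A, B and C, and run_examples() is a crude substitute for the example-running step of R CMD check, not how the real tool is implemented.  The stopifnot() calls play the role of checks placed in each function's \examples{} section.

```r
# Hypothetical stand-ins for the functions A and B in the text.
fun_a <- function(x) x + 1
fun_b <- function(x) 2 * fun_a(x)

# Checks of the kind one might place in each function's \examples{}.
examples <- list(
  fun_a = function() stopifnot(fun_a(1) == 2),
  fun_b = function() stopifnot(fun_b(1) == 4)
)

# Crude stand-in for the example-running step of R CMD check:
# run every example and report TRUE (passed) or FALSE (raised an error).
run_examples <- function(examples) {
  vapply(examples,
         function(ex) tryCatch({ ex(); TRUE }, error = function(e) FALSE),
         logical(1))
}

before <- run_examples(examples)

# While writing C, A is changed; the examples for A and C are written
# to match the new behaviour, but nobody revisits B.
fun_a <- function(x) x + 2
fun_c <- function(x) fun_a(x) - 2
examples$fun_a <- function() stopifnot(fun_a(1) == 3)
examples$fun_c <- function() stopifnot(fun_c(1) == 1)

after <- run_examples(examples)
print(before)
print(after)
```

The second run flags fun_b even though its code was never edited, which is the kind of breakage that is cheap to fix when it is caught right after the change that caused it.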

Spencer

> Duncan Murdoch
>
>> In other languages, I've seen people write the documentation inside the
>> code files and then post-process it to make the documentation.  Is there
>> a similar thing for R, to unify the R code development and
>> documentation/package-making process?
>>
>> pj
>>
>
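
One possible answer to the question above, which this message itself does not give: the roxygen2 package (an assumption here, it is not mentioned in the text quoted above) reads specially marked comments placed directly above a function and writes the Rd file from them.  A minimal sketch, using a hypothetical add_one function:

```r
#' Add one to a number
#'
#' @param x A numeric vector.
#' @return \code{x + 1}, the same length as \code{x}.
#' @examples
#' add_one(1:3)
#' @export
add_one <- function(x) {
  x + 1
}
```

With roxygen2 installed, calling roxygen2::roxygenise() on the package directory would be expected to generate man/add_one.Rd from these comments, so the code and its documentation are edited in one place.
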
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>