[Rd] documentation duplication and proposed automatic tools

Ross Boylan ross at biostat.ucsf.edu
Mon Oct 2 21:15:58 CEST 2006


I've been looking at documenting S4 classes and methods, though I have a
feeling many of these issues apply to S3 as well.

My impression is that the documentation system requires or recommends
creating basically the same information in several places.  I'd like to
explain that, see if I'm correct, and suggest that a more automated
framework might make life easier.

PROBLEM

Consider a class A and a method foo that operates on A.  As I understand
it, I must document the generic function foo (or ?foo will not produce a
response) and the method foo (or methods ? foo will not produce a
response).  Additionally, it appears to be recommended that I document
foo in the Methods section of the documentation for class A.  Finally, I
may want to document the method foo with specific arguments
(particularly if it uses "unusual" arguments, but presumably also if the
semantics are different in a class that extends A).

This seems like a lot of work to me, and it also seems error-prone and
subject to synchronization errors.  R CMD check checks vanilla function
documentation for agreement with the code, but I'm not sure that method
documentation in other contexts gets much help.

To complete the picture, suppose there is another function, "bar",
that operates on A.  B extends A, and reimplements foo, but not bar.
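
For concreteness, the scenario above could be set up roughly as follows.
This is only a sketch: the slot x and the return strings are made up for
illustration; only the names A, B, foo and bar come from the description.

```r
library(methods)

# Class A and a subclass B, as in the scenario above.
# The slot 'x' is an illustrative assumption.
setClass("A", representation(x = "numeric"))
setClass("B", contains = "A")

# Two generics that operate on A.
setGeneric("foo", function(object, ...) standardGeneric("foo"))
setGeneric("bar", function(object, ...) standardGeneric("bar"))

setMethod("foo", "A", function(object, ...) "foo for A")
setMethod("bar", "A", function(object, ...) "bar for A")

# B reimplements foo but simply inherits bar.
setMethod("foo", "B", function(object, ...) "foo for B")

b <- new("B", x = 1)
foo(b)  # dispatches to the method defined for B
bar(b)  # falls back to the method defined for A
```

Even in this tiny case there are two class pages, two generics and
three methods that could each want their own documentation entry.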

I think the suggestion is that I go back and add the B-flavored method
foo to the general methods documentation for foo.  I also have a choice
whether I should mention bar in the documentation for the class B.  If I
mention it, it's easier for the reader to grasp the whole interface that
B presents.  However, I make it harder to determine which methods
implement new functionality.

SOLUTION

There are a bunch of things users of OO systems typically want to know:
1) the relations between classes
2) the methods implemented by a class (for B, just foo)
3) the interface provided by a class (for B, foo and bar)
4) the various implementations of a particular method

All of these can be discovered dynamically by the user.  The problem is
that the current documentation system attempts to reproduce this dynamic
information in static pages.  The prompt, promptClass and promptMethods
functions generate templates that contain much of the information (or at
least they're supposed to; they seem to miss stuff for me, for example
saying there are no methods when there are methods).  This is helpful,
but has two weaknesses.  First, the class developer must enter very
similar information in multiple places (specifically, function, methods,
and class documentation).  Second, that information is likely to get
dated as the classes are modified and extended.
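
As a rough illustration of what "discovered dynamically" means for the
four questions listed above, here is a sketch using functions from the
methods package.  The class and method definitions are the same made-up
example (A, B, foo, bar); which queries are the most convenient is a
matter of taste, and these are just one possible choice.

```r
library(methods)

setClass("A", representation(x = "numeric"))
setClass("B", contains = "A")
setGeneric("foo", function(object, ...) standardGeneric("foo"))
setGeneric("bar", function(object, ...) standardGeneric("bar"))
setMethod("foo", "A", function(object, ...) "foo for A")
setMethod("bar", "A", function(object, ...) "bar for A")
setMethod("foo", "B", function(object, ...) "foo for B")

# 1) Relations between classes.
extends("B", "A")   # TRUE
extends("B")        # "B" "A": the class itself, then its superclasses

# 2) Methods implemented directly for B (no inheritance considered).
existsMethod("foo", "B")   # TRUE
existsMethod("bar", "B")   # FALSE

# 3) The interface of B: selectMethod() also follows inheritance,
#    so it finds a usable method for both generics.
!is.null(selectMethod("foo", "B", optional = TRUE))   # TRUE
!is.null(selectMethod("bar", "B", optional = TRUE))   # TRUE

# 4) The various implementations of a particular generic.
showMethods("foo")
```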

I think it would be better if the class developer could enter the
information once, and the documentation system assemble it dynamically
when the user asks a question.  For example, if the user asks for
documentation on a class, the resulting page would be constructed by
pulling together the class description, appropriate method descriptions,
and links to classes the focal class extends (as well, possibly, as
classes that extend it).  Similarly, a request for methods could
assemble a page out of the snippets documenting the individual methods,
including links to the relevant classes.
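
To make the idea a little more tangible, here is a toy sketch of
assembling a class page on demand.  Everything specific in it is
hypothetical: the snippets vector, its "class:" and "method:" keys, and
the assemble_class_doc function are invented for illustration and are
not part of any existing documentation tool.  It only shows that each
piece of text can be written once and the page derived from the class
hierarchy at the time of the request.

```r
library(methods)

setClass("A", representation(x = "numeric"))
setClass("B", contains = "A")
setGeneric("foo", function(object, ...) standardGeneric("foo"))
setGeneric("bar", function(object, ...) standardGeneric("bar"))
setMethod("foo", "A", function(object, ...) "foo for A")
setMethod("bar", "A", function(object, ...) "bar for A")
setMethod("foo", "B", function(object, ...) "foo for B")

# Hypothetical store: each item is documented exactly once.
snippets <- c(
  "class:A"      = "A basic container.",
  "class:B"      = "A specialised A.",
  "method:foo,A" = "Does foo to an A.",
  "method:bar,A" = "Does bar to an A.",
  "method:foo,B" = "Does foo the B way."
)

# Hypothetical assembler: build the page for a class from the snippets.
# The list of generics is passed in explicitly to keep the sketch short.
assemble_class_doc <- function(cls, generics = c("foo", "bar")) {
  lines <- c(paste("Class", cls), snippets[[paste0("class:", cls)]])
  parents <- setdiff(extends(cls), cls)
  if (length(parents) > 0)
    lines <- c(lines, paste("Extends:", paste(parents, collapse = ", ")))
  lines <- c(lines, "Methods:")
  for (g in generics) {
    # Walk the class and its superclasses, nearest first, and use the
    # first class that defines the method directly.
    for (k in extends(cls)) {
      if (existsMethod(g, k)) {
        tag <- if (k == cls) "" else paste0(" (inherited from ", k, ")")
        key <- paste0("method:", g, ",", k)
        lines <- c(lines, paste0("  ", g, tag, ": ", snippets[[key]]))
        break
      }
    }
  }
  lines
}

writeLines(assemble_class_doc("B"))
```

With this arrangement, adding a bar method for B would change the page
for B without anyone editing the text written for A.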

I realize that implementing this is not trivial, and I'm not necessarily
advocating it as a priority.  But I wonder how it strikes people.

-- 
Ross Boylan                                      wk:  (415) 514-8146
185 Berry St #5700                               ross at biostat.ucsf.edu
Dept of Epidemiology and Biostatistics           fax: (415) 514-8150
University of California, San Francisco
San Francisco, CA 94107-1739                     hm:  (415) 550-1062



