[R] Block comments in R?

Richard A. O'Keefe ok at cs.otago.ac.nz
Mon Oct 9 00:19:24 CEST 2006


I wrote:
    R documentation comments really belong
    in [.Rd] files where help() can find them.

Barry Rowlingson <B.Rowlingson at lancaster.ac.uk> replied:
	R documentation comments belong in .Rd files at the moment, but how 
	joyous would it be if they could be included in the .R files?
	
How joyous?  About as joyous as a root canal job without anaesthetic.

I used to be a fan of including thorough documentation in code.
Then JavaDoc hit the world, and many of the languages I use are following
suit with their own EDoc or PlDoc or whateverDoc mess.

I have read far more Java than I ever wanted to, and the more Java I read
the more I *hate* JavaDoc, and the more I am convinced that if you want
to mix documentation and code you need *really* sophisticated tools (like
Web in its various incarnations) or really simple tools (like Haskell's
"Bird Tracks", a notation I have adapted to my own use for several other
languages).

.Rd files are semisophisticated; if JavaDoc is a reliable guide, then
shoving that stuff into .R files would be horrible.

Do I need to point out the single biggest difference between JavaDoc
and .Rd?  Maybe I do.  .Rd files are *USEFUL*, because of references,
examples, consistency checking, &c, and because they can describe closely
related GROUPS of functions rather than being nailed to specific methods:
the right documentation about the right topics at the right level of
detail.  JavaDoc-style documentation seems to systematically encourage
bulky documentation of low utility.

	  Okay, this is all part of my incessant whining to make R more like 
	Python, but I've found managing separate .Rd and .R files a pain. If .R 
	files could have embedded documentation you'd have one source for code, 
	documentation, tests etc. I did play about with this in the Splus days, 
	attaching documentation strings to functions with attributes, but it was 
	just kludgy without a proper mechanism.
	
Let me point out that right now there is NOTHING stopping anyone mixing
.Rd and .R and test cases as they wish.  How?  Here's how:

	$poomat.1
	.TH POOMAT 1 "Oct 2006" "Version 1.0" "USER COMMANDS"
	.SH NAME
	poomat - poor man's Tangle
	.SH SYNOPSIS
	.B poomat
	file ...
	.SH DESCRIPTION
	.B poomat
	extracts several interleaved files (such as documentation,
	source code, and test cases) from a single file.  You can
	pack several files together using shell archives, tar files,
	or ZIP files, but those are distribution formats, not meant
	to be edited as single files.
	.B poomat
	lets you scatter a file in pieces, interleaved with other
	pieces, so that a function, the documentation for the function,
	and the test cases for the function can all be in one place.
	.SH OPTIONS
	None.
	.B poomat
	concatenates its input files just like
	.IR cat (1)
	or
	.IR awk (1)
	and writes to files named in the input.
	.SH INPUT LANGUAGE
	The input to
	.B poomat
	is a sequence of chunks.  Each chunk is introduced by a line
	consisting of a dollar sign in column 1, immediately followed
	by a file name.  The first chunk for any file name creates a
	new file; remaining chunks for the same file are appended to it.
	.SH BUGS
	Unlike
	.IR tangle (1),
	.IR ctangle (1),
	and other Literate Programming tools, there is no facility for
	re-ordering chunks.  Nor is there any macro facility.
	.PP
	It is up to you to make sure that the file names are portable.
	Stick to 8.3 file names without any directory affixes and you
	should be right.
	$INSTALL
	Edit the first line of the poomat file to refer to the right
	version of awk (nawk, gawk, mawk) for your system, and then
	move the poomat file to some directory in your $PATH.
	$poomat
	#!/usr/ucb/nawk -f
	BEGIN { output = "/dev/stdout" }
	/^[$]/ { output = substr($0, 2); next }
	{ print >output }
	
Yes, I do mean the whole thing is a three-line AWK script.
Yes, I do mean it is language-independent, and doesn't NEED to be built
into R or anything else.
Yes, I do mean that the source code, documentation, and test cases get
separated as part of the build process.  So?
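
For anyone who would rather see the splitting rule spelled out, here is a
rough Python sketch of what the three awk lines do.  It is an illustration
only, not part of poomat: the function name split_chunks and the square.*
file names are made up for the example, and it collects lines in memory
instead of writing files.

```python
def split_chunks(lines):
    """Mimic poomat's routing: map each output name to the lines sent to it."""
    outputs = {}
    current = "/dev/stdout"            # same default as the awk BEGIN block
    for line in lines:
        if line.startswith("$"):       # chunk header: "$" in column 1
            current = line[1:]         # rest of the line is the file name
            continue                   # the header itself is never copied
        # Later chunks for the same name are appended to the earlier ones.
        outputs.setdefault(current, []).append(line)
    return outputs


# Hypothetical combined source: .Rd, .R and a test case interleaved.
combined = [
    "$square.Rd",
    "\\name{square}",
    "\\title{Square a number}",
    "$square.R",
    "square <- function(x) x * x",
    "$square-test.R",
    "stopifnot(square(3) == 9)",
    "$square.Rd",
    "\\usage{square(x)}",
]

for name, body in split_chunks(combined).items():
    print(name, len(body), "line(s)")
```

The second $square.Rd chunk lands after the first one in the same output,
which is the "remaining chunks are appended" rule from the man page.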

For people interested in doing stuff like that with C or C++, it really
doesn't take a lot of work to make poomat notice a
/[.][chCH][pxPX+]?[pxPX+]?$/ suffix and emit a #line directive.
OK, here it is:

	/^[$]/ {
	    output = substr($0, 2)
	    if (/[.][chCH][pxPX+]?[pxPX+]?$/)
		print "#line", FNR+1, "\"" FILENAME "\"" >output
	    next
	}

So your C or C++ debugger can refer back to the original file.
Maybe that should be in the official version, but R doesn't need it.
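
To make the line arithmetic concrete, here is a hedged Python sketch of the
same variant.  The names split_with_line_directives and hello.pm are
invented for the example; the regular expression is the one from the awk
rule above, and FNR+1 becomes "header line number plus one", i.e. the
first line of the chunk's body in the combined file.

```python
import re

# Same suffix test as the awk rule: .c, .h, .C, .cpp, .hpp, .cxx, .c++, ...
C_SUFFIX = re.compile(r"[.][chCH][pxPX+]?[pxPX+]?$")


def split_with_line_directives(lines, filename):
    """Like poomat, but start each C/C++ chunk with a #line directive."""
    outputs = {}
    current = "/dev/stdout"
    for lineno, line in enumerate(lines, start=1):   # lineno plays the role of FNR
        if line.startswith("$"):
            current = line[1:]
            if C_SUFFIX.search(current):
                # Point the compiler/debugger at the next line of the original.
                directive = f'#line {lineno + 1} "{filename}"'
                outputs.setdefault(current, []).append(directive)
            continue
        outputs.setdefault(current, []).append(line)
    return outputs


# Hypothetical input: two C chunks with a text chunk in between.
source = [
    "$hello.c",
    "int main(void) { return 0; }",
    "$notes.txt",
    "plain text, no directive wanted here",
    "$hello.c",
    "/* more C, continuing the same file */",
]

for name, body in split_with_line_directives(source, "hello.pm").items():
    print(name)
    for text in body:
        print("   ", text)
```

Every C or C++ chunk gets its own directive, so line numbers stay right
even when chunks for other files sit in between.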


