[R] Any "special interest" in R/pic interface?

(Ted Harding) ted.harding at nessie.mcc.ac.uk
Sun Aug 5 14:01:49 CEST 2007


Hi Folks,

I'm wondering if there are people out there who would
be interested in what would be involved in developing
an interface between R graphics and the 'pic' language.

Explanation: 'pic' has been part of the Unix 'troff'
typesetting suite since very early days (1970s), and is also
part of GNU troff, 'groff'. Its function is to act as a
preprocessor, translating textual descriptions of graphical
displays into the formatting language used by troff.

Example:

.PS
## Need x- and y-scale factors to exist before referring to them
xsc=1.0 ; ysc=1.0
## Define the basic graphics object: the histogram bar
##   uses positional parameters $1, $2, $3, $4 in the data line
define bar {
  box width ($2 - $1)*xsc height $3*ysc \
  with .sw at ($1*xsc,0) fill $4
}
## Draw the basic histogram
xsc=1.0 ; ysc=0.75
copy thru bar until "EOT"
-2.5   2.5 31.0 0.25
 2.5   7.5 69.0 0.25
 7.5  12.5 50.0 0.25
12.5  17.5  0.0 0.25
17.5  22.5  0.0 0.25
22.5  27.5  0.0 0.25
27.5  32.5  0.0 0.25
32.5  37.5  1.0 0.25
37.5  42.5  1.0 0.25
42.5  47.5  0.0 0.25
EOT
.PE

which will draw the histogram bars, each with a black border,
and grey-filled at "grey level" 0.25. The above is readily
entered by hand (with the data copied from R or imported from
a file). But it is a very simple example.

Each bar is a 'pic' "box" object, with width equal to the
difference between its upper and lower breakpoints (so
variable-width can be accommodated), scaled by factor 'xsc'.
The placement of the box is set by specifying that its
SouthWest corner ".sw" is at (x,0) where 'x' is the lower
of the two breakpoints for the box.

When a histogram (one of my histograms, "MH", in this case)
has been constructed by

MH <- hist(..., plot=FALSE)

the first two columns above are available as MH$breaks[1:10]
and MH$breaks[2:11].

The height of each box is set to the value of the 3rd column,
scaled by factor 'ysc'; the values in the 3rd column are
available in MH$counts. The 4th column (grey-level) is whatever
you like.

The whole array can be constructed in R using a simple cbind(),
and it is easy to see how to write an R routine which would
output the whole of the above code block to a file, which could
then be copied into a troff source document (which is ASCII
text throughout).
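
As a rough sketch of such a routine (the function name write_pic_hist
and its arguments are my own invention for illustration, not anything
that exists in R), assuming the histogram object has the usual $breaks
and $counts components:

```r
# Sketch: write a 'pic' histogram block from an R histogram object.
# 'write_pic_hist' and its arguments are hypothetical names.
write_pic_hist <- function(h, con = stdout(), xsc = 1.0, ysc = 0.75,
                           grey = 0.25) {
  n <- length(h$counts)
  # Columns: lower break, upper break, count, grey level
  tab <- cbind(h$breaks[1:n], h$breaks[2:(n + 1)],
               as.numeric(h$counts), grey)
  rows <- sprintf("%g %g %g %g", tab[, 1], tab[, 2], tab[, 3], tab[, 4])
  macro <- c("define bar {",
             "  box width ($2 - $1)*xsc height $3*ysc \\",
             "  with .sw at ($1*xsc,0) fill $4",
             "}")
  out <- c(".PS",
           "xsc=1.0 ; ysc=1.0",   # scale factors must exist before use
           macro,
           sprintf("xsc=%g ; ysc=%g", xsc, ysc),
           "copy thru bar until \"EOT\"",
           rows,
           "EOT",
           ".PE")
  writeLines(out, con)            # 'con' may be a connection or a file name
  invisible(out)
}

# Example use (MH as in the text; "hist.pic" is an arbitrary file name):
#   MH <- hist(x, plot = FALSE)
#   write_pic_hist(MH, "hist.pic")
```

The data lines come out unaligned (sprintf with %g), which 'pic' does
not mind; the file can then be pasted into the troff source and edited.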

The above example is of course the bare basics. You would need
to add extra 'pic' statements to draw the axes, with coordinate
values as annotations (this is a straightforward loop in 'pic').
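
For those who would rather stay on the R side, the same axis statements
can be generated there instead of by a loop in 'pic'. A minimal sketch,
assuming a horizontal axis along y=0; the name pic_x_axis and the
parameters ticklen and labgap are hypothetical, and their values are in
the picture's own coordinate units:

```r
# Sketch: emit 'pic' statements for an x-axis with labelled ticks.
# 'pic_x_axis', 'ticklen' and 'labgap' are hypothetical names.
pic_x_axis <- function(at, ticklen = 0.05, labgap = 0.15) {
  # The axis itself, from the smallest to the largest tick position
  axis_line <- sprintf("line from (%g*xsc,0) to (%g*xsc,0)",
                       min(at), max(at))
  # One short downward tick per position
  ticks <- sprintf("line from (%g*xsc,0) to (%g*xsc,-%g)",
                   at, at, ticklen)
  # The coordinate value as a text label below each tick
  labels <- sprintf("\"%g\" at (%g*xsc,-%g)", at, at, labgap)
  c(axis_line, ticks, labels)
}

# Example use: tick marks at the histogram breakpoints
#   writeLines(pic_x_axis(MH$breaks))
```

The resulting lines would go between .PS and .PE, after 'xsc' has been
set.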

You could amend the code to cause the count-values (when non-zero)
to be placed on the tops of the bars as follows:

define bar {
  box width ($2 - $1)*xsc height $3*ysc \
  with .sw at ($1*xsc,0) fill $4
  if($3>0) then {
    sprintf("%.0f",$3) "" at top of last box
  }
}

The "top" of the box is the midpoint of its top side, and the
function sprintf("%.0f",$3) does what R users would expect,
producing a text object which is then stacked on top of the
empty text object "", the whole being vertically and horizontally
centred at the "top of the box" (this ensures that the visible
text is just above the top side; otherwise it would be vertically
centred on, i.e. cut by, this line).

You can readily add further annotations, and adjust features of
annotations at pleasure: e.g. to set them 2 points smaller than
the point size of the main text, and in italic type, you could
modify the sprintf() to read:

  sprintf("\s-2\fI%.0f\fP\s0",$3)

(\s-2 makes the type 2 points smaller; \s0 restores the previous
point size; \fI switches to Italic style, \fP switches back to the
Previous style). You can easily do other things, e.g. rotated text.

Such a segment of 'pic' language can either be incorporated into
a troff document, within the main text of the document (so would
appear as a Figure in the text), or could be a self-contained troff
document which could be used to generate an EPS (Encapsulated
PostScript) file which could then be imported into any document
(or supplied as a stand-alone Figure to a journal publisher who
accepts stand-alone Figures in this format; etc).

I've given a simple example, to illustrate (a) getting data
from R into 'pic' and (b) some of the detailed control one has
over the layout and appearance of the result.

However, my real interest here is in more complicated graphics
which can be generated by R, such as the Trellis Plots in lattice
graphics.

For instance, the example for xyplot():

xyplot(
  Sepal.Length + Sepal.Width ~ Petal.Length + Petal.Width | Species, 
  data = iris, allow.multiple = TRUE, scales = "free",
  layout = c(2, 2),
  auto.key = list(x = .6, y = .7, corner = c(0, 0))
)

lies well within the capabilities of 'pic'. What one would need, to
make this kind of thing generic, would be

a) 'pic' code to specify the basic primitives of the plot
   (analogous to the "define bar{...}" in my histogram example)
b) Layout information
c) Access to the numerical (and any textual) information which
   determines the position and value of plotted objects
d) Export to a file which can then be readily edited to "tune" the
   appearance of the result to the user's satisfaction (as in the
   choice of size and style for the printed histogram counts).

I'm well aware of R's xfig() device, which exports to XFig files
which the xfig program can in turn export to 'pic' language.

However, the resulting 'pic' code is horrendous, and essentially
impossible to edit for a diagram of any complexity (it's near
impossible for a mere scatter-plot of 2 variables). The 'pic'
code written by a human is much more transparent and easy to
edit (see my histogram example), and is the kind of objective
I have in mind. I envisage that in complex plots produced by R,
such as the above xyplot, one can obtain the generic information
one needs by

XY <- xyplot(....)
str(XY)

then extracting XY$whatever, and wrapping this stuff in 'pic' code.
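
To make that a little more concrete, here is a tentative sketch using a
simpler one-response formula than the example above. It assumes that
the trellis object keeps the per-panel coordinates in its 'panel.args'
component (as str(XY) suggests for xyplot); the function name
panel_to_pic and the choice of a small circle as plotting symbol are
mine, purely for illustration:

```r
library(lattice)

# Sketch: pull per-panel coordinates out of a trellis object and wrap
# them in 'pic' statements. 'panel_to_pic' is a hypothetical name.
XY <- xyplot(Sepal.Length ~ Petal.Length | Species, data = iris)

panel_to_pic <- function(obj, i, rad = 0.02) {
  p <- obj$panel.args[[i]]   # assumed: list holding x and y for panel i
  # One small circle per data point, scaled by xsc and ysc as before
  sprintf("circle rad %g at (%g*xsc,%g*ysc)", rad, p$x, p$y)
}

pts <- panel_to_pic(XY, 1)   # statements for the first panel
head(pts, 3)
```

Layout, axis limits and panel labels would have to be dug out of other
components of the object in the same way; I have not attempted that here.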


Looking forward to hearing if anyone is interested in joining in
an exploration of this territory!

Best wishes to all,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 05-Aug-07                                       Time: 13:01:40
------------------------------ XFMail ------------------------------


