[R] Tables with Graphical Representations

Frank E Harrell Jr f.harrell at vanderbilt.edu
Fri Sep 1 17:06:29 CEST 2006


(Ted Harding) wrote:
> On 31-Aug-06 Sam Ferguson wrote:
>> Hi useRs -
>>
>> I was wondering if anyone out there can tell me where to find
>> R-code to do mixes of tables and graphics. I am thinking of
>> something similar to this:
>> http://yost.com/information-design/powerpoint-corrupts/
>> or like the excel routines people are demonstrating:
>> http://infosthetics.com/archives/2006/08/excel_in_cell_graphing.html
>>
>> My aim is to provide small graphics to illustrate numbers directly  
>> beside or behind their position in the table. Maybe there is a way
>> to do it with lattice?
>>
>> Thanks for any help you may be able to provide.
>> Sam Ferguson
> 
> I dare say there may be a way to do that kind of thing directly within R,
> and if so then the graphics experts will no doubt tell us how!
> 
> But your examples are just one kind of combined tabular/graphic layout
> (and somewhat similar to each other). In a more general context of
> combining tables of numerical results with graphic displays, it is
> perhaps better to think in terms of using R to produce the numerical
> results in the first instance, and then handing these over to software
> designed for general-purpose graphical/textual layout. You then have
> complete control, and full flexibility of design.
> 
> Indeed, in your second (Excel) example, the method of production is
> just a nasty kludge -- and it was a happy coincidence that the "REPT"
> function was available in Excel at all!
> 
> As Frank Harrell has just posted (just as I was completing this one!),
> you can do this sort of thing in LaTeX (his example shows little
> histograms of the data, above each different tabular section). LaTeX
> is an example of software which allows you to create precisely formatted
> graphics within precisely formatted text.
> 
> However, I'm no expert on LaTeX, preferring what I've been used to for
> too many years, namely Unix 'troff' and its more recent GNU implementation
> 'groff'.
> 
> As a preliminary, you will need to get R to output a suitable data
> file, or a suitably composed data file with 'groff' formatting tags
> interspersed. The latter should not be difficult, though my own approach
> would be to simply take a data file of the form (for your first example
> as taken from your URL):
> 
> "% survival / standard error" "5 year" "10 year" "15 year" "20 year"
> "Prostate" 98.8 0.4 95.2 0.9 87.1 1.7 81.3 3.0
> "Thyroid" 96.0 0.8 95.8 1.2 94.0 1.6 95.4 2.1
> "Testis" 94.7 1.1 94.0 1.3 91.1 1.8 88.2 2.3
> [...]
> 
> (which would be very straightforward in R) and then use say 'awk'
> to compute 'groff' data with embedded tags (see below).
> 
> The file which I would then submit to 'groff' would look like
> 
> 
> 
> .ds RED "\X'ps: exec 1 0 0 setrgbcolor'
> .ds GREY "\X'ps: exec 0.5 0.5 0.5 setrgbcolor'
> .ds BLACK "\X'ps: exec 0 0 0 setrgbcolor'
> .ds bx \x'-0.2m'\x'-0.2m'\v'0.2m'\Z'\
> \*[RED]\D'P \\$1p 0 0 -1m -\\$1p 0 0 1m'\
> '\
> \Z'\
> \h'\\$1p'\
> \*[GREY]\D'P 0.5i-\\$1p 0 0 -1m \\$1p-0.5i 0 0 1m'\
> '\h'0.5i'\
> \v'-0.2m'\*[BLACK]
> .LP
> .TS
> box tab(#);
> c3 s1 s1w(0.5i) s s1 s1w(0.5i) s s1 s1w(0.5i) s s1 s1w(0.5i) s.
> 
> \f[BMB]\s[15]Estimated survival rates by cancer site\s0\fP
> 
> .T&
> l c s s s s s s s s s s s.
> #\fB\s[12]% survival / standard error\s0\fP
> #\_
> .T&
> l c s s c s s c s s c s s.
> #5 year#10 year#15 year#20 year
> #\_#\_#\_#\_
> .T&
> l  n l n n c n n c n n c n.
> Prostate#98.8#\*[bx 35.6]#0.4#95.2#\*[bx 34.3]#0.9#87.1#\
> \*[bx 31.4]#1.7#81.3#\*[bx 29.3]#3.0
> Thyroid#96.0#\*[bx 34.6]#0.8#95.8#\*[bx 34.5]#1.2#94.0#\
> \*[bx 33.8]#1.6#95.4#\*[bx 34.3]#2.1
> Testis#94.7#\*[bx 34.1]#1.1#94.0#\*[bx 33.8]#1.3#91.1#\
> \*[bx 32.8]#1.8#88.2#\*[bx 31.8]#2.3
> [...]
> Pancreas#4.0#\*[bx 1.4]#0.5#3.0#\*[bx 1.1]#1.5#2.7#\
> \*[bx 1.0]#0.6#2.7#\*[bx 1.0]#0.8
> 
> .TE
> 
> 
> 
> The key here is to define a "parametrised string" which will
> be invoked as "\*[bx <number>]". This is the main "embedded tag".
> 
> Each box is 0.5 inch wide (36 points), and consists of a lefthand
> section in Red whose width is 36*percent/100 points, with a
> righthand section in Grey whose width is 36*(1 - percent/100) points.
> The height of the box is 1 em (which, in points, is the point-size
> of the current font), and the box has been shifted downwards slightly
> (0.2m) to align it nicely with the text. The parameter "<number>"
> in "\*[bx <number>]" is the value of 36*percent/100. So this can, for
> instance, be easily computed in an 'awk' run.
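
The arithmetic in that paragraph is easy to check by hand. A minimal sketch, in Python rather than awk purely for illustration; the function name bar_width and the box_points default are hypothetical names, not part of Ted's source file:

```python
def bar_width(percent, box_points=36.0):
    """Return the Red-section width in points, as a string with one decimal.

    box_points is the full box width (0.5 inch = 36 points), an assumed default.
    """
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must lie between 0 and 100")
    return f"{box_points * percent / 100.0:.1f}"


# Survival percentages from the Prostate row of the table
for pct in (98.8, 95.2, 87.1, 81.3):
    print(pct, bar_width(pct))
```

For the Prostate row this gives 35.6, 34.3, 31.4 and 29.3, which are the arguments passed to \*[bx ...] in the table data above.
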
> 
> The block of "code"
> 
> .ds bx \x'-0.2m'\x'-0.2m'\v'0.2m'\Z'\
> \*[RED]\D'P \\$1p 0 0 -1m -\\$1p 0 0 1m'\
> '\
> \Z'\
> \h'\\$1p'\
> \*[GREY]\D'P 0.5i-\\$1p 0 0 -1m \\$1p-0.5i 0 0 1m'\
> '\h'0.5i'\
> \v'-0.2m'\*[BLACK]
> 
> defines the tag "\*[bx ...]", which is responsible for drawing the
> graphical item in the table wherever it is invoked. Initially it
> is padded above and below with a bit of extra space ("\x...") and
> moved down slightly ("\v'0.2m'"), then colour changes to Red and
> a filled Red polygon is drawn; then the drawing point is shifted
> and a filled Grey polygon is drawn. Finally the colour is changed
> back to Black for the text part of the Table. The value of "<number>"
> is substituted for "\\$1" wherever this occurs in the definition
> of "bx".
> 
> The line ".TS" leads in to a Table definition, which ends with ".TE".
> The next few lines specify table layout (types, spacings and
> widths of columns, cell separator "#", etc.); and then come the
> data for each line of the table, in which the box tag "\*[bx ...]"
> occurs where needed. As indicated above, the full table data could
> probably be easily computed in R and can certainly be easily done
> in 'awk' or 'perl'.
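
One possible way to generate those data lines, sketched in Python instead of R, awk or perl. The name tbl_row and its arguments are illustrative only; the sketch assumes the "#" separator set by "tab(#)" and the 36-point box width, and it writes each row as a single line rather than splitting it with a backslash continuation:

```python
def tbl_row(site, pairs, box_points=36.0, sep="#"):
    """Build one tbl data line: site, then (value, box tag, SE) per period.

    pairs is a list of (percent survival, standard error) tuples.
    """
    fields = [site]
    for survival, std_err in pairs:
        # Width of the Red section in points: box width * percent / 100
        width = box_points * survival / 100.0
        fields += [f"{survival:.1f}", f"\\*[bx {width:.1f}]", f"{std_err:.1f}"]
    return sep.join(fields)


# Figures copied from the example data file in the post
rows = {
    "Prostate": [(98.8, 0.4), (95.2, 0.9), (87.1, 1.7), (81.3, 3.0)],
    "Thyroid": [(96.0, 0.8), (95.8, 1.2), (94.0, 1.6), (95.4, 2.1)],
    "Testis": [(94.7, 1.1), (94.0, 1.3), (91.1, 1.8), (88.2, 2.3)],
}
for site, pairs in rows.items():
    print(tbl_row(site, pairs))
```

Each printed line would go between the last ".T&" format section and ".TE" in the groff file.
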
> 
> After all that, the result is quite pleasing -- and, when I compare
> it with the graph shown on Sam's URL, it seems to me to represent
> the numbers much more accurately, as well as being visually slightly
> more expressive.
> 
> It would also be quite feasible to "complicate" the graphics with
> indications of SE etc., by adding more to the definition of \*[bx ...].
> 
> I have looked at the "LaTeX file produced by latex.describe" for
> Frank Harrell's example. Granting that it has no doubt been automatically
> produced, it is enormous and, for practical purposes, uneditable if
> you want to tweak features of the display. It would be interesting
> to see what had to be done further back up the line to produce it;
> this might be, of course, much easier to tweak. On the other hand,
> my 'groff source' file above is compact and easily changed.
> 
> If anyone would like to look at the output I have produced by the
> above method (PDF file), and the full groff source file, drop me a
> line (I'll send them privately to Sam anyway).
> 
> Best wishes to all,
> Ted.
>
Ted - neat stuff - my aim is to not have to edit the LaTeX at all, i.e., 
to keep tuning the R code that produces LaTeX.  Your ideas also make me 
think of Xfig and something involving the R xfig driver.

Frank


