[R] Graphics and LaTeX documents with the same font

hadley wickham h.wickham at gmail.com
Sat Sep 29 17:41:07 CEST 2007


On 9/29/07, hadley wickham <h.wickham at gmail.com> wrote:
> On 9/29/07, Michael Friendly <friendly at yorku.ca> wrote:
> >
> > hadley wickham wrote:
> > >>
> > > I was interested to see that you have code for drawing scatterplots
> > > with multiple y-axes.  As far as I know the only legitimate use for a
> > > double-axis plot is to confuse or mislead the reader (and this is not
> > > a very ethical use case).  Perhaps you have a counter-example?
> > >
> > > Hadley
> > >
> > While it is true that the double-Y-axis graph is generally considered
> > sinful, it can be used effectively to show the relation of two time
> > series in ways that other graphs can't do as well.
> >
> > For one striking example,
> > a political, presentation graphic, see:
> > http://www.math.yorku.ca/SCS/Gallery/images/commonsenserevolution6.pdf
> > described on my Graphical Excellence page,
> > http://www.math.yorku.ca/SCS/Gallery/excellence.html
> > I found it easy to excuse the sin by the 'wow effect' produced by the
> > graph.
>
> While I agree that the double y-axis plot can be used to compare two
> time series, I'm not sure whether or not it actually is effective.
> The appearance of the display is so critically dependent on the
> relative scales of the axes, that it is easy to draw the wrong
> conclusion.  Why not use a scatterplot or path plot (i.e. connect
> subsequent observations with edges) if you want to understand the
> relationship between two variables?

To compare the scatterplot with the double-axis plot, I used GraphClick
(http://www.arizona-software.ch/graphclick/) to digitise the graphic,
which gave the following dataset:

csr <- structure(list(year = c(1985, 1986, 1987, 1988, 1989, 1990, 1991,
1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
2003, 2004, 2005, 2006), deaths = c(1, 1, 7, 5, 12, 3, 7, 5,
4, 6, 8, 19, 26, 20, 42, 41, 45, 41, 27, 52, 67, 50), income = c(NA,
8572, NA, NA, 9264, 10071, 10338, 10687, 10666, 10666, 9907,
8141, 8059, 7997, 7874, 7648, 7484, 7319, 7135, 7135, 7011, NA
)), .Names = c("year", "deaths", "income"), row.names = c(NA,
-22L), class = "data.frame")

and produce the attached graphic (I'm not sure if the attachment will
make it to r-help, but the code should be reproducible on any system):

library(ggplot2)
ggplot(csr, aes(x = deaths, y = income)) +
  geom_path(colour = "grey80") +
  geom_point()

# or without connecting lines
ggplot(csr, aes(x=deaths, y=income)) + geom_point()

I find this graph much easier to interpret: one can see outliers, a
suggestion of non-linearity, and so on.  It would also be easy to add
the political party with colour or shape.
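
By contrast, how a double-axis plot looks depends on the axis limits
chosen for each series. The sketch below illustrates that with plain
arithmetic on two observations taken from the dataset (1995 and 2005).
The helper to_primary() and all axis limits are assumptions made up for
illustration; they are not read off the original graphic.

```r
# Map values from a secondary axis onto the primary axis' coordinates.
# to_primary() is a hypothetical helper, not part of any package.
to_primary <- function(y2, lim2, lim1) {
  (y2 - lim2[1]) / diff(lim2) * diff(lim1) + lim1[1]
}

# Two observations from csr: the years 1995 and 2005.
deaths <- c(8, 67)
income <- c(9907, 7011)

# Primary (deaths) axis limits: an assumption for this sketch.
lim_deaths <- c(0, 70)

# The same income values drawn against two different secondary axes.
tight <- to_primary(income, lim2 = c(7000, 11000), lim1 = lim_deaths)
loose <- to_primary(income, lim2 = c(0, 11000), lim1 = lim_deaths)

# Apparent fall in income, measured in units of the deaths axis.
diff(tight)
diff(loose)

# How much steeper the fall looks on the tight axis than on the loose one.
diff(tight) / diff(loose)
```

The data are identical in both cases; only the secondary axis range
changes, yet the apparent fall in income relative to the rise in deaths
differs by the ratio of the two axis ranges.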

I'm not sure whether it's a good idea to include the line - the
gestalt principle of connectedness makes it very difficult to
interpret the points as separate objects even when the line connecting
them is so faint.
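
One possible way to keep the time ordering without a connecting line is
to map year to colour instead. This is only a sketch and was not part of
the original comparison; the point size and the choice to drop years
with no income reading are assumptions. The data frame is rebuilt here
with the same values as above so that the snippet runs on its own.

```r
library(ggplot2)

# Same values as the digitised dataset above.
csr <- data.frame(
  year = 1985:2006,
  deaths = c(1, 1, 7, 5, 12, 3, 7, 5, 4, 6, 8, 19, 26, 20, 42, 41,
             45, 41, 27, 52, 67, 50),
  income = c(NA, 8572, NA, NA, 9264, 10071, 10338, 10687, 10666, 10666,
             9907, 8141, 8059, 7997, 7874, 7648, 7484, 7319, 7135, 7135,
             7011, NA)
)

# Drop years with no income reading so ggplot2 does not warn about them.
csr_complete <- csr[!is.na(csr$income), ]

# Points only; the colour gradient carries the time order.
p <- ggplot(csr_complete, aes(x = deaths, y = income, colour = year)) +
  geom_point(size = 3)
print(p)
```

Each point remains a separate object, and the gradient shows roughly
where in the period it falls.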

Hadley

-- 
http://had.co.nz/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: csr-scatterplot.pdf
Type: application/pdf
Size: 8458 bytes
Desc: not available
Url : https://stat.ethz.ch/pipermail/r-help/attachments/20070929/475fdd8f/attachment.pdf 

