[R] Plotting from different data sources on the same plot (with ggplot2)

jiho jo.irisson at gmail.com
Thu Sep 27 12:52:18 CEST 2007


Hello everyone (and Hadley in particular),

I often need to plot data from multiple datasets on the same graph. A
common example is mapping some values: I want to plot the underlying
map and then add the points. I currently do it with base graphics, by
recording the maximum region in which my map and points will fit,
plotting both with these xlim and ylim parameters, adding
par(new=TRUE) between plot calls and setting the graphical parameters
(to draw axes and titles, to set the aspect ratio) by hand. This is
neither easy nor practical when the plots become more and more
complicated.
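
To make the base graphics workflow concrete, here is a minimal sketch
of it. The two small data frames are made up for illustration and only
stand in for the real coastline and station data; the column names lon
and lat are the ones used later in this message.

```r
# Made-up data standing in for the real coastline and station files
coast  <- data.frame(lon = c(0, 1, 2, 3, 4), lat = c(0, 2, 1, 3, 2))
coords <- data.frame(lon = c(1.5, 2.5, 5),   lat = c(0.5, 4, 1))

# Record the maximum region in which both datasets fit
xlim <- range(coast$lon, coords$lon)
ylim <- range(coast$lat, coords$lat)

# First plot: the coastline, with limits, labels and aspect ratio set by hand
plot(coast$lon, coast$lat, type = "l",
     xlim = xlim, ylim = ylim, asp = 1,
     xlab = "lon", ylab = "lat")

# Second plot drawn over the first one, reusing the same limits and
# suppressing axes and labels so they are not drawn twice
par(new = TRUE)
plot(coords$lon, coords$lat, type = "p",
     xlim = xlim, ylim = ylim, asp = 1,
     axes = FALSE, xlab = "", ylab = "")
```

Every additional dataset means extending the two range() calls and
repeating the par(new=TRUE) block, which is where this gets tedious.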

The ggplot book states that "[ggplot] makes it easy to combine
data from multiple sources". Since I use ggplot2 as much as I can
(thanks, it's really, really great!) I thought I would try producing
such a plot with ggplot2.

NB: If this is possible/easy with another plotting package, please
let me know. I am not looking for something specific to maps but
rather for a generic mechanism to throw several pieces of data at a
graph and have the plotting routine take care of setting up axes that
will fit all the data on the same scale.

So, now for the ggplot2 part. I have two data sources: the
coordinates of the coastlines in a region of interest and the
coordinates of sampling stations in a subset of this region. I want
to plot the coastline as a line and the stations as points, on the
same graph. I can easily plot them independently:

p1 = ggplot(coast, aes(x=lon, y=lat)) + geom_path() + coord_equal(ratio=1)
p1$aspect.ratio = 1

p2 = ggplot(coords, aes(x=lon, y=lat)) + geom_point() + coord_equal(ratio=1)
p2$aspect.ratio = 1

but I cannot find how to combine the two graphs. I suspect this has
to be done via different layers, but I really can't find how.
In particular, I would like to know how to deal with the scales: can
ggplot take care of plotting the two datasets in the same coordinate
system, or do I have to manually record the maximal range of x and y
and force ggplot to use it for both layers, as I did with base
graphics? (Of course I would prefer the former ;) ).

To test it further with real data, here is my code and data:
	http://jo.irisson.free.fr/dropbox/test_ggplot2.zip

One additional clarification: I would like the two datasets to stay
separate. I could probably combine them and plot everything
in one step by clever use of ggplot arguments. However, this is just a
simple example and I would like to add more in the future (like
trajectories at each station, points proportional to some value at
each station, etc.), so I really want the different data sources to be
kept separate and the plot to be produced in several steps; otherwise
it will soon become too complicated to manage.
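
To show the kind of step-by-step construction meant here, the
following is only a sketch under assumptions, not a confirmed recipe:
it assumes a ggplot2 version in which each geom accepts its own data
argument and in which the scales are computed over all layers
together. The data frames are made up for illustration.

```r
library(ggplot2)

# Made-up data standing in for the real coastline and station files
coast  <- data.frame(lon = c(0, 1, 2, 3, 4), lat = c(0, 2, 1, 3, 2))
coords <- data.frame(lon = c(1.5, 2.5, 5),   lat = c(0.5, 4, 1))

# Step 1: start from an empty plot and add the coastline as its own layer
p <- ggplot() + geom_path(data = coast, aes(x = lon, y = lat))

# Step 2: add the stations from a separate data frame as a second layer
p <- p + geom_point(data = coords, aes(x = lon, y = lat))

# Step 3: ask for equal scaling on both axes, once for the whole plot
p <- p + coord_equal(ratio = 1)

print(p)
```

Under those assumptions, later additions such as trajectories would
each be one more layer with its own data frame.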

Thank you very much in advance for your help.

JiHO
---
http://jo.irisson.free.fr/


