[R] Plotting from different data sources on the same plot (with ggplot2)

jiho jo.irisson at gmail.com
Mon Oct 1 07:51:44 CEST 2007


This was meant to be sent on the list:

On 2007-September-30  , at 23:12 , jiho wrote:
> On 2007-September-30  , at 21:01 , hadley wickham wrote:
>>>> [...]
>>> As expected there is nothing in the data part of the p object
>>>> p$data
>>> NULL
>>>
>>> But there is no data specification either in the layers
>>>> p$layers
>>> [[1]]
>>> geom_path: (colour=black, size=1, linetype=1) + ()
>>> stat_identity: (...=) + ()
>>> position_identity: ()
>>> mapping: ()
>>>
>>> [[2]]
>>> geom_point: (shape=19, colour=black, size=2) + ()
>>> stat_identity: (...=) + ()
>>> position_identity: ()
>>> mapping: ()
>>
>> Compare geom_point(data = mtcars) with str(geom_point(data = mtcars))
>> (which throws an error but you should be able to see enough).  So the
>> layers aren't printing out their dataset if they have one - another
>> bug.  I'll add it to my todo.
>
> I see. I did not know the `str` function. Very useful.
>
>>> [...]
>>> About the other solution:
>>>
>>>>> When tinkering a bit more with this I thought that the more natural
>>>>> and "ggplot" way to do it, IMHO, would be to have a new addition
>>>>> (`+`) method for the ggplot class and be able to do:
>>>>>         p = p1 + p2
>>>>> and have p containing both plots, on the same scale (the union of the
>>>>
>>>> You were obviously pretty close to the solution already! You just
>>>> need to remove the elements that p2 already has in common with p1 and
>>>> just add on the components that are different.
>>>
>>> I would love to be able to do so because this way I can define custom
>>> plot functions that all return a ggplot object and then combine
>>> these at will to get final plots (e.g. one function for the coastline,
>>> another for station coordinates, another one which gets one data
>>> value, yet another for bathymetry contours, etc.). This modular
>>> design would be more efficient than having to predefine all
>>> combinations in ad hoc functions (e.g. one function for
>>> coast+bathy+stations, another for coast+stations only, another for
>>> coast+bathy+stations+data1, another for... you get the point).
>>> However I don't see what to add and what to remove from the objects.
>>> Specifically, there is only one "data" element in the ggplot object
>>> while my two objects (p1 and p2) both contain something different in
>>> $data. Should I define p$data as a list with p$data[[1]]=p1$data and
>>> p$data[[2]]=p2$data?
>>
>> You can do this already :
>>
>> sample <- c(geom_point(data = coast), geom_path(data = streams),
>>             coord_equal())
>> p + sample
>>
>> I think the thing you are missing is that the elements in ggplot() are
>> just defaults that can be overridden in the individual layers
>> (although the bug above means that isn't working quite right at the
>> moment).  So just specify the dataset in the layer that you are
>> adding.
>>
>> You can do things like:
>>
>> p <- ggplot(mapping = aes(x=lat, y = long)) + geom_point()
>> # no data so there's nothing to plot:
>> p
>>
>> # add on data
>> p %+% coast
>> p %+% coords
>
> That's great!
> In fact I think I found exactly what I was looking for. I can just do:
> 	p = ggplot() + coord_equal()
> 	p$aspect.ratio = 1
> to set up the plot, and then add the layers and have ggplot take care
> of resizing and laying out everything automagically:
> 	p = p + geom_path(data=coast, mapping=aes(x=lon, y=lat))
> 	p = p + geom_point(data=coords, mapping=aes(x=lon, y=lat))
> 	p = p + geom_text(data=coords, mapping=aes(x=lon, y=lat, label=station))
> 	etc...
> Oh, I love ggplot ;) !
>
>> The data is completely independent of the plot specification. This is
>> very different from the other plotting models in R, so it may take a
>> while to get your head around it.
>
> Yes, indeed. That's a completely new way of thinking (especially given
> my MATLAB and Scilab background), but how powerful! I found the whole
> "data mapping" concept very elegant but did not grasp all the
> flexibility behind it. I wonder how mainstream it can get, since so
> many people are used to another graphics paradigm.
>
> Anyway, I just need to define a new geom_arrow now, to plot wind
> velocity arrows at several locations, and I'll be a happy man. Is
> there a specific reason why '...' arguments are not passed to grid
> functions, or is it just to keep the complexity under control? I am
> thinking in particular that:
> 	p = ggplot(coords) + geom_segment(mapping=aes(x=lon, y=lat,
> 		xend=lon+0.03, yend=lat-0.02),
> 		arrow=arrow(length=unit(0.1,"inches")))
> would do exactly what I want provided that the 'arrow' argument is
> passed on to segmentsGrob, which is used in geom_segment.
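
For the archive, here is a minimal sketch of the modular approach discussed above: each helper returns a list of layers carrying its own data, and the lists are added to a shared base plot. The two data frames and the helper names coast_layer() and station_layers() are made up for illustration, and the sketch assumes a ggplot2 version in which a list of layers can be added to a plot with `+`.

```r
library(ggplot2)

# Hypothetical example data, invented for this sketch (not from the thread)
coast <- data.frame(lon = c(7.0, 7.1, 7.2, 7.3),
                    lat = c(43.5, 43.6, 43.55, 43.7))
coords <- data.frame(lon = c(7.05, 7.25),
                     lat = c(43.45, 43.6),
                     station = c("A", "B"))

# Each helper returns a list of layers, not a complete plot.
# Every layer names its own dataset, so the base plot needs no data.
coast_layer <- function(d) {
  list(geom_path(data = d, mapping = aes(x = lon, y = lat)))
}

station_layers <- function(d) {
  list(
    geom_point(data = d, mapping = aes(x = lon, y = lat)),
    geom_text(data = d, mapping = aes(x = lon, y = lat, label = station),
              vjust = -1)
  )
}

# Shared base: no data, equal scaling on both axes
base <- ggplot() + coord_equal()

# Combine the pieces at will
p_coast <- base + coast_layer(coast)
p_full  <- base + coast_layer(coast) + station_layers(coords)
```

Printing p_coast or p_full draws the corresponding combination; the scales are computed from the union of the data in all layers, so nothing has to be predefined per combination.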

JiHO
---
http://jo.irisson.free.fr/


