[R] Clean up a scatterplot with too much data

Paul Hiemstra paul.hiemstra at knmi.nl
Tue Aug 2 15:11:01 CEST 2011


On 08/02/2011 01:07 PM, Dennis Murphy wrote:
> In addition to the other responses (all of which I liked), a couple of
> other alternatives to consider are 2D density plots (see ?kde2d in the
> MASS package, for example) or geom_tile() in the ggplot2 package,
> which you can think of as a 3D histogram projected to 2D with color
> corresponding to (relative) frequency, as suggested by Paul Hiemstra.
> geom_tile() is a discretized, gridded version of a hexbin plot, but I

With geom_tile you have to bin the data yourself. I much prefer
stat_bin2d, which does the binning for you.
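
To illustrate, here is a minimal sketch of the stat_bin2d approach. The
original data set is not available, so the data frame `games` and its columns
`minutes` and `kills_per_min` are made-up stand-ins filled with simulated
values, and the choice of 50 bins per axis is arbitrary.

```r
# Sketch only: simulated data standing in for the poster's real data set.
# The names games, minutes and kills_per_min are hypothetical.
library(ggplot2)

set.seed(1)
n <- 20000
games <- data.frame(
  minutes       = rexp(n, rate = 1 / 200),               # minutes played
  kills_per_min = rlnorm(n, meanlog = -0.5, sdlog = 0.4)  # kills per minute
)

# stat_bin2d() divides the x/y plane into a rectangular grid, counts the
# points in each cell and maps that count to the fill colour, so no manual
# binning is needed before plotting.
p <- ggplot(games, aes(x = minutes, y = kills_per_min)) +
  stat_bin2d(bins = 50)

print(p)
```

Fewer bins give a coarser but smoother picture; more bins show finer detail
at the cost of noisier counts per cell.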

cheers,
Paul

> would start with the hexbin myself. I echo KOH's comment: make sure
> you remove the outliers first, especially that one in the upper left
> corner :)
>
> After looking at your plot, here's my question: why would you plot
> kills/minute vs. minutes played? Doesn't the first variable render the
> second one moot? Wouldn't kills vs. minutes played be a more relevant
> (scatter)plot? If you have information on the skill level of the
> players, you could incorporate that information into the plot as well.
> There are several nice ways to go if this is the case.
>
> If kills/minute is the more appropriate measure, a univariate density
> plot would make sense, or a histogram.
>
> HTH,
> Dennis
>
> On Mon, Aug 1, 2011 at 10:26 PM, DimmestLemming <NICOADAMS000 at gmail.com> wrote:
>> I'm working with a lot of data right now, but I'm new to R, and not very good
>> with it, hence my request for help. What type of graph could I use to
>> straighten out things like...
>>
>> http://r.789695.n4.nabble.com/file/n3711389/Untitled.png
>>
>> ...this?
>>
>> I want to see general frequencies. Should I use something like a 3D
>> histogram, or is there an easier way like, say, shading? I'm sure these are
>> both possible, but I don't know which is easiest or how to implement either
>> of them.
>>
>> Thanks!
>>
>> --
>> View this message in context: http://r.789695.n4.nabble.com/Clean-up-a-scatterplot-with-too-much-data-tp3711389p3711389.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>


-- 
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494

http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770


