[R] PDF file size for very large scatter plots

Ben Bolker bolker at ufl.edu
Fri Aug 15 22:00:28 CEST 2008


Nazareno Andrade <nazareno <at> lsd.ufcg.edu.br> writes:

> 
> Jim,
> 
> Thanks for the answer. Using pch="." reduces the file to ~3MB... Still large.
> 
> I'll look into hexbins, but if I understand it right, it would 'round'
> points which are nearby into a same hexagon, right? Couldn't that
> result in an inaccurate view of a scatter plot?
> 
> Here's the code I'm using:
> 
> pdf(); plot(rnorm(1e5), rnorm(1e5), pch = "."); dev.off()
> 
> thanks again,
> Nazareno
> 
> On Fri, Aug 15, 2008 at 12:27 PM, jim holtman <jholtman <at> gmail.com> wrote:
> > Have you tried using  pch='.'?
> >
> > Also you might consider using 'hexbin' for creating the scatter plot.
> >
> > On Fri, Aug 15, 2008 at 12:24 PM, Nazareno Andrade
> > <nazareno <at> lsd.ufcg.edu.br> wrote:
> >> Dear all,
> >>
> >> I am plotting a scatter plot for a large sample (1e+05 ordered pairs).
> >> This produces a large (~5MB) file in a pdf or postscript terminal, and
> >> I am wondering whether there are methods for reducing the size of the
> >> resulting file so that it is easier to include it in a document. I'd
> >> rather stick with pdf or ps as I am using latex.
> >>
> >> thanks,
> >> Nazareno
> >>

    You can embed a PNG in a LaTeX file if you want (with pdflatex,
\includegraphics reads PNG directly): google "latex png".
The problem is that lots of these points overlap, and they're all
going into the PDF file whether or not they're visible in the plot.
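
  To see that every point lands in the file regardless of overlap,
something like the following rough sketch should do. pdf_size() is a
made-up helper name for illustration, not part of base R, and the
exact byte counts will depend on your R version and on whether the
pdf device compresses its output:

```r
# Hypothetical helper: draw n random points to a temporary PDF
# and return the size of the resulting file in bytes.
pdf_size <- function(n) {
  f <- tempfile(fileext = ".pdf")
  pdf(f)
  plot(rnorm(n), rnorm(n), pch = ".")
  dev.off()
  size <- file.size(f)
  unlink(f)  # clean up the temporary file
  size
}

set.seed(1)
n_points <- c(1e3, 1e4, 1e5)
sizes <- sapply(n_points, pdf_size)

# One row per plot: number of points and file size in bytes.
print(data.frame(n = n_points, bytes = sizes))
```

The file size should keep growing with the number of points, even
though the plotted cloud looks about equally black in the middle.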

  I just did

png(file="a.png",height=2000,width=2000)
...
dev.off()

which created a 100K file with resolution much higher than
that of my screen, on which every distinct point seems to be visible.
I suppose you could try to thin the data set by figuring
out which points are exactly on top of each other at a given
resolution (by rounding, pasting columns together and looking
for duplicates, or perhaps by using hexbin at a ridiculously
high resolution), but the PNG solution seems much easier.
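
  If you do want to stay with PDF, the rounding-and-duplicates idea
might look roughly like this. It is only a sketch: thin_points() and
its nx/ny arguments are names I made up, the grid size is an arbitrary
stand-in for "resolution", and it assumes x and y are not constant.
Kept points stay at their original coordinates; only points that
fall in an already occupied grid cell are dropped.

```r
# Hypothetical helper: keep one point per cell of an nx-by-ny grid.
thin_points <- function(x, y, nx = 2000, ny = 2000) {
  # Map each coordinate to an integer cell index in 0..(nx-1) / 0..(ny-1).
  # Assumes max(x) > min(x) and max(y) > min(y).
  ix <- round((x - min(x)) / (max(x) - min(x)) * (nx - 1))
  iy <- round((y - min(y)) / (max(y) - min(y)) * (ny - 1))
  # Paste the two indices together and drop repeated cells.
  keep <- !duplicated(paste(ix, iy))
  data.frame(x = x[keep], y = y[keep])
}

set.seed(1)
x <- rnorm(1e5)
y <- rnorm(1e5)

thinned <- thin_points(x, y, nx = 500, ny = 500)
nrow(thinned)  # number of points that survive the thinning

out <- tempfile(fileext = ".pdf")
pdf(out)
plot(thinned$x, thinned$y, pch = ".")
dev.off()
```

A coarser grid (smaller nx, ny) drops more points and gives a smaller
file, at the cost of a plot that may start to look different from the
full one; you would have to experiment to find a grid that is fine
enough for your output size.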

  good luck,
   Ben Bolker


