[R] PDF too large, PNG bad quality

Barry Rowlingson b.rowlingson at lancaster.ac.uk
Thu Oct 22 22:42:31 CEST 2009


On Thu, Oct 22, 2009 at 8:28 PM, Greg Snow <Greg.Snow at imail.org> wrote:
> The problem with the pdf files is that they store the information for every one of your points, even the ones that are overplotted by other points.  The png file is smaller because it only stores which color each pixel should be, not how many points contributed to a particular pixel being that color.  But png files convert the text to pixel information as well, which doesn't look good if the image is scaled afterwards.
>
> If you want to go the pdf route, then you need to find some way to reduce redundant information while still getting the main points of the plot.  With so many points, I would suggest looking at the hexbin package (Bioconductor, I think) as one approach; it will not be an identical scatterplot, but it will convey the information (possibly better) with much smaller graphics file sizes.  There are other tools, like sunflower plots, but hexbin has worked well for me.
>
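 For what it's worth, a minimal hexbin sketch (assuming the package is
installed; x and y here are placeholders standing in for the real
coordinates) would be something like:

library(hexbin)
x <- rnorm(1e6); y <- rnorm(1e6)    # placeholder for a million points
bin <- hexbin(x, y, xbins = 50)     # count points per hexagonal cell
plot(bin)                           # one hexagon per cell, so the file
                                    # stays small no matter how many
                                    # points went in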

 I've seen this kind of thing happen after waiting an hour for one of
my printouts, queued behind something submitted by one of our extreme
value stats people. They make plots containing maybe a million points,
most of which sit in one big black blob, but they want to be able to
show the important sixty or so points at the extremes.

 I'm not sure what the best way to print this kind of thing is - if
they know where the big blob is going to be, they could apply some
cutoff to the plot, show only the points outside the cutoff, and fill
the region inside it with a single black polygon, as sketched below...
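 A rough sketch of that idea (assuming a roughly circular blob centred
on the origin; r is a hypothetical cutoff radius and x, y stand in for
the real data):

set.seed(1)
x <- rnorm(1e6); y <- rnorm(1e6)          # stand-in for the real points
r <- 3                                    # assumed cutoff radius
keep <- x^2 + y^2 > r^2                   # keep only the extreme points
pdf("extremes.pdf")
plot(x[keep], y[keep], pch = ".")         # a few hundred points, not 1e6
theta <- seq(0, 2 * pi, length.out = 200)
polygon(r * cos(theta), r * sin(theta),
        col = "black", border = NA)       # one disc replaces the blob
dev.off()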

 Another idea may be to do a high-resolution plot as a PNG (think 300
pixels per inch of your desired final output), but do it without any
text and add that on later in a graphics package.
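 Something like this, say for a 7 x 7 inch final figure (reusing the x
and y from the sketch above; the filename and dimensions are
placeholders):

png("points.png", width = 7 * 300, height = 7 * 300, res = 300)
par(mar = c(0, 0, 0, 0))                  # no margins, no text
plot(x, y, pch = ".",
     axes = FALSE, ann = FALSE)           # points only; add axes and
dev.off()                                 # labels afterwards in a
                                          # graphics package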

Barry



