[R] scatterplot of 100000 points and pdf file format

Witold Eryk Wolski wolski at molgen.mpg.de
Wed Nov 24 17:49:44 CET 2004


Hi,
Yes, indeed the hexbin package generates very cool pix. They look great. 
I was using it already.
But this time I am interested in visualizing exactly the _scatter_ of 
some extreme points.

Eryk

Liaw, Andy wrote:

>Marc/Eryk,
>
>I have no experience with it, but I believe the hexbin package in BioC was
>there for this purpose: avoid heavy over-plotting lots of points.  You might
>want to look into that, if you have not done so yet.
>
>Best,
>Andy
>
>  
>
>>From: Marc Schwartz
>>
>>On Wed, 2004-11-24 at 16:34 +0100, Witold Eryk Wolski wrote:
>>
>>>Hi,
>>>
>>>I want to draw a scatter plot with 1M and more points and save it as pdf.
>>>This makes the pdf file large.
>>>So I tried to save the file first as png and then convert it to pdf.
>>>This looks OK if printed, but if viewed e.g. with Acrobat as a document
>>>figure the quality is bad.
>>>
>>>Does anyone know a way to reduce the size but keep the quality?
>>>
>>
>>Hi Eryk!
>>
>>Part of the problem is that in a pdf file, the vector based instructions
>>will need to be defined for each of your 10^6 points in order to draw
>>them.
>>
>>When trying to create a simple example:
>>
>>pdf()
>>plot(rnorm(1000000), rnorm(1000000))
>>dev.off()
>>
>>The pdf file is 55 Mb in size.
>>
>>One immediate thought was to try a ps file and using the above plot, the
>>ps file was "only" 23 Mb in size. So note that ps can be more efficient.
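
A minimal sketch of the pdf versus ps comparison described above. It uses fewer points than the 10^6 in the thread so that it runs quickly, and it passes compress = FALSE to pdf() on the assumption that an uncompressed file is closer to what was measured in the thread. The sizes you get will depend on the R version and on n, so none are quoted here.

```r
# Write the same scatter plot with the pdf() and postscript() devices
# and compare the resulting file sizes.
set.seed(1)
n <- 1e5          # assumption: smaller than the 10^6 points in the thread
x <- rnorm(n)
y <- rnorm(n)

pdf_file <- tempfile(fileext = ".pdf")
ps_file  <- tempfile(fileext = ".ps")

# compress = FALSE is an assumption made for this sketch; the default
# in current R is a compressed PDF stream.
pdf(pdf_file, compress = FALSE)
plot(x, y)
dev.off()

postscript(ps_file)
plot(x, y)
dev.off()

# File sizes in megabytes, one entry per device.
sizes_mb <- file.size(c(pdf_file, ps_file)) / 1024^2
names(sizes_mb) <- c("pdf", "ps")
print(round(sizes_mb, 2))
```

Both files still hold one drawing instruction per point, so either size grows roughly in proportion to n.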
>>
>>Going to a bitmap might result in a much smaller file, but as you note,
>>the quality does degrade as compared to a vector based image.
>>
>>I tried the above to a png, then converted to a pdf (using 'convert')
>>and as expected, the image both viewed and printed was "pixelated",
>>since the pdf instructions are presumably drawing pixels and not vector
>>based objects.
>>
>>Depending upon what you plan to do with the image, you may have to
>>choose among several options, resulting in tradeoffs between image
>>quality and file size.
>>
>>If you can create the bitmap file explicitly in the size that you
>>require for printing or incorporating in a document, that is one way to
>>go and will preserve, to an extent, the overall fixed size image
>>quality, while keeping file size small.
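
The fixed-size bitmap suggestion above could look roughly like this. The 6 x 6 inch size, the 300 dpi resolution and pch = "." are illustrative assumptions, not values from the thread; choose them to match the final document.

```r
# Render the scatter plot as a PNG at the size it will be printed,
# so that no later rescaling is needed.
set.seed(1)
n <- 1e5
x <- rnorm(n)
y <- rnorm(n)

png_file <- tempfile(fileext = ".png")

# 6 in * 300 dpi gives an 1800 x 1800 pixel image (assumed values).
png(png_file, width = 6, height = 6, units = "in", res = 300)
plot(x, y, pch = ".")
dev.off()

# Size on disk in bytes; for a bitmap this depends on the pixel
# dimensions and image content rather than on one record per point.
png_size <- file.size(png_file)
print(png_size)
```

The tradeoff noted in the thread still applies: the image is sharp only at or below the size and resolution it was rendered for.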
>>
>>Another option to consider for the pdf approach, if it does not
>>compromise the integrity of your plot, is to remove any duplicate data
>>points if any exist. Thus, you will not need what are in effect
>>redundant instructions in the pdf file. This may not be possible
>>depending upon the nature of your data (i.e. doubles) without considering
>>some tolerance level for "equivalence".
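
One way to read the tolerance idea above, as a sketch only: snap both coordinates to a grid of spacing tol and keep the first point that falls in each grid cell. The function name thin_points() and the parameter tol are hypothetical, and the value 0.01 is an arbitrary choice for illustration.

```r
# Treat two points as "equivalent" when they fall in the same cell of a
# grid with spacing `tol`, and keep only the first point in each cell.
# thin_points() and `tol` are hypothetical names used for this sketch.
thin_points <- function(x, y, tol) {
  stopifnot(length(x) == length(y), tol > 0)
  # Integer grid cell of every point (one row per point).
  cell <- cbind(round(x / tol), round(y / tol))
  # duplicated() on a matrix flags rows that repeat an earlier row.
  keep <- !duplicated(cell)
  data.frame(x = x[keep], y = y[keep])
}

set.seed(1)
x <- rnorm(1e5)
y <- rnorm(1e5)

thinned <- thin_points(x, y, tol = 0.01)
print(nrow(thinned))   # number of points left to draw
```

Plotting thinned$x against thinned$y in place of the full data then needs fewer drawing instructions. An isolated point in the tails occupies its own cell, so it is always kept; only points in crowded regions are dropped.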
>>
>>Perhaps others will have additional ideas.
>>
>>HTH,
>>
>>Marc Schwartz
>>


-- 
Dipl. bio-chem. Witold Eryk Wolski
MPI-Moleculare Genetic
Ihnestrasse 63-73 14195 Berlin
tel: 0049-30-83875219                 __("<    _
http://www.molgen.mpg.de/~wolski      \__/    'v'
http://r4proteomics.sourceforge.net    ||    /   \
mail: witek96 at users.sourceforge.net    ^^     m m
      wolski at molgen.mpg.de