[R] Computations slow in spite of large amounts of RAM.

Huiqin Yang Huiqin.Yang at noaa.gov
Tue Jul 1 15:55:39 CEST 2003


Hi all,

I am a beginner trying to use R to work with large amounts of
oceanographic data, and I find that computations can be VERY slow.  In
particular, computational speed seems to depend strongly on the number
and size of the objects loaded into the workspace when R starts up:
the same computations run significantly faster when all but the
essential objects are removed.  I am running R on a machine with 16 GB
of RAM, and our Unix system administrator assures me that my R process
still has unused memory available to it.
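
In case it is relevant, a rough way to see which objects are taking up
space in the workspace would be something like the sketch below (I am
not certain that object.size() is the right measure for this):

## List the objects in the workspace with their approximate sizes in
## bytes, largest first
sort(sapply(ls(), function(nm) object.size(get(nm))), decreasing = TRUE)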

1.  Is the problem associated with how R uses memory?  If so, is there
some way to increase the amount of memory used by my R process to get
better performance?
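
For example, would checking gc() and starting R with explicit heap
settings along the lines of the sketch below make any difference?  The
option values are just placeholders; I do not know whether these are
even the right knobs.

## Inside R: report current memory use and the garbage-collection
## trigger sizes
gc()

## At startup, from the shell, something like:
##   R --min-vsize=256M --max-vsize=2048M --min-nsize=1M --max-nsize=20M
## (the values are guesses, not recommendations)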

The computations that are particularly slow involve looping with
by().  The data are measurements of vertical profiles of pressure,
temperature, and salinity at a number of stations, which are organized
into a data frame p.1 (1925930 rows, 8 columns: id, p, t, s, etc.).
There are 1409 unique values of id, and the objective is to reduce
this to a much smaller data frame with the minimum and maximum
pressure for each profile.  The slow part is:

## For each profile (grouped by id), keep the id and the maximum and
## minimum pressure
h.maxmin <- by(p.1, p.1$id, function(x) {
    data.frame(id   = x$id[1],
               maxp = max(x$p),
               minp = min(x$p))
})

2.  Even with unneeded data objects removed, this is very slow.  Is
there a faster way to get the maximum and minimum values?
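
For example, would a tapply()-based version along the lines of the
sketch below be expected to be faster?  I have not verified that it
gives the same result as the by() version (for one thing, id comes
back as the factor-level labels here).

## Per-profile maximum and minimum pressure via tapply(), then recombine
maxp <- tapply(p.1$p, p.1$id, max)
minp <- tapply(p.1$p, p.1$id, min)
h.maxmin.2 <- data.frame(id = names(maxp), maxp = maxp, minp = minp)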

My version information:

platform sparc-sun-solaris2.9
arch     sparc               
os       solaris2.9          
system   sparc, solaris2.9   
status                       
major    1                   
minor    7.0                 
year     2003                
month    04                  
day      16                  
language R             

Thank you for your time.

Helen



