[Rd] possible memory leak involving looping, optimization, and gam

Christopher Paciorek paciorek at hsph.harvard.edu
Sun Feb 1 22:20:59 CET 2009


When I run the gam function as part of an optimization, repeating the optimization many times in a loop, I find that memory use increases over time (based on simply monitoring top).  Below is example code that varies the penalty parameter in gam, searching for the value that gives exactly 50 edf in a simple smoothing problem.  I thought I would post to the list in case anyone has ideas about what might be going on and whether this is indeed a memory leak.

I'm running R 2.8.0 (64-bit) under RHEL4 on a cluster, and R 2.8.1 under Fedora 10 on an individual machine, both Intel-based.
The issue does not seem to occur on a Windows XP machine with R 2.6.1 (32-bit), nor on Mac OS X (Leopard) with R 2.6.2.

mgcv versions are 1.4-1 for Linux and 1.3-29 for Windows and Mac.

library(mgcv)  # provides gam() and s(); needed for the code to run

n = 700
x = runif(n)
## pairwise distances among the x values
dist = abs(matrix(x, n, n, byrow = TRUE) - matrix(x, n, n, byrow = FALSE))
## exponential covariance; draw correlated y via the Cholesky factor
Sigma = exp(-dist / 0.3)
y = t(chol(Sigma)) %*% rnorm(n)

## squared deviation of the fitted edf from the target, as a function of sp
psFun = function(spVal, dfWanted) {
  (summary(gam(y ~ s(x, k = 400), sp = spVal))$edf - dfWanted)^2
}

## repeatedly optimize over sp; memory use grows steadily across iterations
for (i in 1:10000) {
  spVal = optimize(psFun, c(0.00001, 500), 50)$minimum
  print(i)
}
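To quantify the growth from within R rather than by watching top, one can record total memory in use after a full garbage collection on each iteration.  This is a minimal sketch of that idea (not from the original post); it uses a smaller problem (n = 100, a sine signal, a lower target edf) so the loop runs quickly, and `sum(gc()[, 2])` to total the Mb currently used by Ncells and Vcells:

```r
library(mgcv)

set.seed(1)
n = 100                                 # smaller than the original n = 700, for speed
x = runif(n)
y = sin(2 * pi * x) + rnorm(n, sd = 0.3)

psFun = function(spVal, dfWanted) {
  (summary(gam(y ~ s(x, k = 40), sp = spVal))$edf - dfWanted)^2
}

memUsed = numeric(20)
for (i in 1:20) {
  optimize(psFun, c(1e-5, 500), dfWanted = 10)
  memUsed[i] = sum(gc()[, 2])           # total Mb in use after a full gc
}
print(memUsed)
```

If memUsed climbs steadily even though gc() has just run, unreachable memory is accumulating at the C level or being retained somewhere R's collector cannot free it, which is consistent with a leak rather than merely delayed collection.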

Note that the same issue seems to arise regardless of whether I use uniroot, optimize, nlminb, or nlm to do the optimization.
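For reference, the uniroot formulation mentioned above looks like the following sketch (hypothetical names, same simulated setup as before): instead of minimizing the squared deviation, it finds the sp at which edf minus the target crosses zero, which works because edf decreases monotonically as sp increases.

```r
library(mgcv)

set.seed(1)
n = 100
x = runif(n)
y = sin(2 * pi * x) + rnorm(n, sd = 0.3)

## signed deviation of the fitted edf from the target
edfDiff = function(spVal, dfWanted) {
  summary(gam(y ~ s(x, k = 40), sp = spVal))$edf - dfWanted
}

## edfDiff is positive at tiny sp (near-interpolating fit, high edf) and
## negative at large sp (heavy smoothing, low edf), so a root exists
spRoot = uniroot(edfDiff, c(1e-5, 500), dfWanted = 10)$root
print(spRoot)
```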

-chris

----------------------------------------------------------------------------------------------
Chris Paciorek / Asst. Professor        Email: paciorek at hsph.harvard.edu
Department of Biostatistics             Voice: 617-432-4912
Harvard School of Public Health         Fax:   617-432-5619
655 Huntington Av., Bldg. 2-407         WWW: www.biostat.harvard.edu/~paciorek
Boston, MA 02115 USA                    Permanent forward: paciorek at alumni.cmu.edu
