[R] FW: R memory management

Patrick Burns pburns at pburns.seanet.com
Sat Dec 8 18:47:58 CET 2007


The line:

  data. <- c(data., new.data)

will eat both memory and time voraciously, because each
call to c() allocates a new vector and copies everything
already in 'data.' into it.

Instead, create 'data.' at the final size it will be
and then subscript into it.  If you don't know the
final size, then grow it by a lot a few times rather
than by a little lots of times.
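
As a rough sketch of both suggestions (not code from the original
post): the first part fills a vector whose final size is known, and
the second doubles the vector whenever it runs out of room.  The names
make.reader and read.all, and the initial.size argument, are made up
for illustration; make.reader only stands in for repeated
readLines(con2) calls so the example runs without a socket.

```r
# Case 1: final size known. Allocate once, then subscript into it.
n.total <- 5
known <- character(n.total)
for (i in seq_len(n.total)) known[i] <- paste("line", i)

# Hypothetical stand-in for successive readLines(con2) calls: each call
# returns the next chunk of lines from a prepared list.
make.reader <- function(chunks) {
  i <- 0
  function() {
    i <<- i + 1
    chunks[[i]]
  }
}

# Case 2: final size unknown. Double the vector when it fills up,
# so it is reallocated only a few times.
read.all <- function(read.chunk, initial.size = 1024) {
  stopifnot(initial.size >= 1)
  data. <- character(initial.size)
  n <- 0                                  # number of slots actually used
  repeat {
    new.data <- read.chunk()
    k <- length(new.data)
    if (k > 0) {
      while (n + k > length(data.))       # out of room: double the size
        length(data.) <- 2 * length(data.)
      data.[(n + 1):(n + k)] <- new.data  # subscript into existing storage
      n <- n + k
    }
    if (any(new.data == "!ENDMSG!")) break
  }
  data.[seq_len(n)]                       # drop the unused tail
}

reader <- make.reader(list(c("a", "b"), c("c", "d", "e"), c("f", "!ENDMSG!")))
result <- read.all(reader, initial.size = 2)
```

With doubling, the vector is reallocated only about log2(n) times, so
the total amount of copying stays proportional to the final length
rather than to its square.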


Patrick Burns
patrick at burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")

Yuri Volchik wrote:

>Hi,
>
> 
>
>I'm using R to collect data for a number of exchanges through a socket
>connection, and I constantly run into memory problems even though I
>believe the task is not that memory-consuming. I guess there is a
>miscommunication between R and WinXP about freeing up memory.
>
>So this is the code:
>
> 
>
>for (x in 1:length(exchanges.to.get)) {
>
>   # symbols listed on this exchange
>   tickers <- sqlQuery(channel, paste("SELECT Symbol FROM symbols_list WHERE Exchange='",
>                                      exchanges.to.get[x], "';", sep=''))[,1]
>
>   dir.create(paste(Working.dir, exchanges.to.get[x], '/', sep=''))
>
>   for (y in 1:length(tickers)) {
>
>     # open socket connection to get data
>     con2 <- socketConnection(Sys.info()["nodename"], port = ****)
>
>     writeLines(paste(command, ',', tickers[y], ',', interval, ';', sep=''), con2)
>
>     data. <- readLines(con2)
>
>     end.of.data <- sum(c(data.=="!ENDMSG!", data.=="!SYNTAX_ERROR!"))
>
>     # keep reading until the end-of-message marker arrives
>     while (end.of.data != 1) {
>       new.data <- readLines(con2)
>       end.of.data <- sum(new.data=="!ENDMSG!")
>       data. <- c(data., new.data)
>     }
>
>     # write everything except the last two lines to <ticker>_.csv
>     if (length(data.) > 3)
>       write.table(data.[1:(length(data.)-2)],
>                   paste(Working.dir, exchanges.to.get[x], '/',
>                         sub('\\*', '+', tickers[y]), '_.csv', sep=''),
>                   quote=F, col.names=F, row.names=F)
>
>     close(con2)
>
>   }
>
>  rm(tickers)
>
>  gc()
>
>}
>
> 
>
> 
>
>With gcinfo(TRUE) I got the following output (some examples):
>
> 
>
>Garbage collection 16362 = 15411+754+197 (level 0) ... 
>
>6.3 Mbytes of cons cells used (22%)
>
>2.2 Mbytes of vectors used (8%)
>
> 
>
>Garbage collection 16407 = 15454+756+197 (level 0) ... 
>
>13.1 Mbytes of cons cells used (46%)
>
>10.4 Mbytes of vectors used (39%)
>
> 
>
>Garbage collection 16410 = 15456+756+198 (level 2) ... 
>
>4.9 Mbytes of cons cells used (21%)
>
>0.9 Mbytes of vectors used (4%)
>
> 
>
>Garbage collection 16679 = 15634+796+249 (level 0) ... 
>
>150.7 Mbytes of cons cells used (95%)
>
>203.9 Mbytes of vectors used (75%)
>
> 
>
>Garbage collection 16680 = 15634+796+250 (level 2) ... 
>
>4.9 Mbytes of cons cells used (4%)
>
>0.9 Mbytes of vectors used (0%)
>
> 
>
>Garbage collection 16808 = 15754+802+252 (level 0) ... 
>
>6.1 Mbytes of cons cells used (7%)
>
>1.8 Mbytes of vectors used (1%)
>
> 
>
>But the end result in Task Manager is:
>
>RGui.exe  Mem Usage 470,472K  VM Size 541,988K
>
> 
>
>Even though R reports 
>
>Garbage collection 16808 = 15754+802+252 (level 0) ... 
>
>6.1 Mbytes of cons cells used (7%)
>
>1.8 Mbytes of vectors used (1%)
>
> 
>
>Has anybody encountered this problem, and how do you deal with it?  It
>seems like a memory leak to me, as the tasks are not memory demanding;
>the biggest amount of data in a single file is about 40MB.
>
> 
>
>Thanks
>
>
>
>
>
>  
>


