[R] (no subject)

Tom Knockinger tomkn at gmx.at
Fri Dec 11 17:07:14 CET 2009


Hi,
I am new to the R project. Until now I have found solutions to every problem in tutorials, R wikis and this mailing list, but now I have some problems that I can't solve with that knowledge.

I have some data like this:

# sample data
head1 = "a;b;c;d;e;f;g;h;i;k;l;m;n;o"
data1 = "1;1;1;1;1;1;1;1;1;1;1;1;1;1"
data2 = "2;2;2;2;2;2;2;2;2;2;2;2;2;2"
data3 = "3;3;3;3;3;3;3;3;3;3;3;3;3;3"
datastring = paste("", head1,data1,data2,data3,"",sep="\n")

# import operation
res = read.table(textConnection(datastring), header=TRUE, sep = c(";"))
closeAllConnections()

# I use these two lines in a for-loop like this: 
#for( j in 1:length(data)) {
#	res[j] = read.table(textConnection(datastring[j]), header=TRUE, sep = c(";"))
#	closeAllConnections()
#}

I get these strings from a file that contains about 50 to 1000 of them, and I read them all into a list. I am not sure whether there is a better way to do this, but it works for me. Maybe you have suggestions for a better solution.
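One possible way to write the import step is sketched here. It wraps the two lines in a small function that closes only the connection it opened, rather than calling closeAllConnections(). The name parse_block and the sample strings in blocks are made up for illustration and are not part of my real code.

```r
# Hypothetical helper (illustration only): parse one semicolon-separated
# block of text into a data frame.
parse_block <- function(txt) {
  con <- textConnection(txt)
  on.exit(close(con))  # close just this connection, even if read.table fails
  read.table(con, header = TRUE, sep = ";")
}

# Made-up sample input: two blocks in the same layout as the sample data.
blocks <- c("a;b;c\n1;1;1\n2;2;2",
            "a;b;c\n3;3;3")

# One data frame per block, collected in a list.
res <- lapply(blocks, parse_block)
```

With this shape, each connection is closed as soon as its block has been read, so none should be left over for R to clean up later.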

Now, after this short introduction to the R program I use, I have two problems with this approach.

1) warnings
From time to time, after some other operations, I get warnings like "unused connection 3 (datastring) closed". But all connections should already be closed, and I don't create new ones.

2) ram usage and program shutdowns
length(data) is usually between 50 and 1000, so the data takes some space in RAM (approx. 100-200 MB), which is no problem. The analysis code I use results in about 500-700 MB of RAM usage, which is also not a real problem.
The results are matrices (50x14 to 1000x14), so they are small enough to work with afterwards: to create plots or do further analysis.
So I wrote a function that does the analysis one file after another and keeps only the results in a list. But after about 2-4 files my R process uses about 1500 MB and then the trouble begins: the R console terminates or prints an error that no more space can be allocated. So I have to process each file separately, save each result to a file, and restart R after every 2 processed files. I have to do that 3-5 times until all files are processed, which is a bit annoying.
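The structure described above looks roughly like the following sketch. Everything in it is a stand-in: analyse_file is a hypothetical placeholder for my real analysis, and the input files are tiny temporary files created only so that the example runs on its own.

```r
# Hypothetical stand-in for the real analysis: column means of one file.
analyse_file <- function(path) {
  dat <- read.table(path, header = TRUE, sep = ";")
  colMeans(dat)  # only this small result leaves the function
}

# Made-up input files so the sketch is self-contained.
paths <- replicate(3, tempfile(fileext = ".csv"))
for (p in paths) writeLines(c("a;b", "1;2", "3;4"), p)

# Process one file after another and keep only the results.
results <- vector("list", length(paths))
for (i in seq_along(paths)) {
  results[[i]] <- analyse_file(paths[i])
}

unlink(paths)  # remove the temporary files again
```

Since dat exists only inside analyse_file, nothing but the small result vectors should remain referenced once each call returns.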

I did some research on this problem and found out that
-) when I import data into the same variable again, the RAM usage goes up by about 100-200 MB each time instead of the old data being reused or purged. The old data should be overwritten, since it is no longer accessible after I import a new file.
-) the same occurs with the analysis functions, which use much more space and also don't release the old, no longer used variables. But ls() doesn't show them at all.
-) even after I clear all variables with "rm(list=ls(all=TRUE))", the used RAM stays the same.

So is there a way to get the RAM back, so that I can do all the analysis in one session and don't have to mess around with additional files?
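For reference, a minimal sketch of the one mechanism I know of: rm() only removes the name, and the memory behind it becomes reusable when the garbage collector runs, which gc() triggers explicitly. The matrix big is made up for illustration. Whether the operating system then shows lower usage for the R process may depend on the platform, so the summary that gc() returns is probably the more reliable thing to look at.

```r
# Made-up example object: a 1000 x 1000 matrix of doubles (about 8 MB).
big <- matrix(0, nrow = 1000, ncol = 1000)
size_bytes <- as.numeric(object.size(big))

rm(big)      # removes the name; the memory is reclaimed at the next collection
mem <- gc()  # force a collection and keep the usage summary (a matrix)
print(mem)   # rows Ncells and Vcells show what R currently has in use
```

Calling gc() before and after a step and comparing the two summaries would show whether the objects really became unreachable or whether something still refers to them.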


Thanks for your help

Tom