[R] Memory Problems with a Simple Bootstrap

Tom La Bone booboo at gforcecable.com
Fri Aug 1 18:09:42 CEST 2008



I have a data file called inputdata.csv that looks something like this:

      ID   Year  Result  Month    Date
1   7174   1954      10      3  540301
2   7174   1954       4      3  540322
3  20924   1967       4      2  670223
4  20924   1967      -7      5  670518
5  20924   1967      -3      7  670706
...
67209 ...

i.e., it goes on for 67209 rows (a file of about 2 MB). When I run the
following bootstrap session, I get the error shown at the end:

> 
> library(boot)
> setwd("C:/Documents and Settings/Tom/Desktop")   
> 
> data.in <- read.csv("inputdata.csv",header=T,as.is=T)
> 
> per95 <- function( annual.data, b.index) {
+   sample.data <- annual.data[b.index,]
+   return(quantile(sample.data$Result,probs=c(0.95))) }
> 
> m <- 10000
> for (i in 1:39) {
+   annual.data <- data.in[data.in$Year == (i+1949),]
+   B <- boot(data=annual.data,statistic=per95,R=m)
+   print(i)
+   print(memory.size())
+ }
[1] 1
[1] 20.26163
[1] 2
[1] 61.6352
[1] 3
[1] 134.4187
[1] 4
[1] 149.4704
[1] 5
[1] 290.3090
[1] 6
[1] 376.7017
[1] 7
[1] 435.7683
[1] 8
[1] 463.7404
[1] 9
[1] 497.7946
Error: cannot allocate vector of size 568.8 Mb
> 

I am running this on a Windows XP Pro machine with 4 GB of memory. The same
problem occurs when the code is executed on the same box running Ubuntu
8.04. Does anyone see any obvious reason why this should run out of memory?
I would be happy to email the data file to anyone who cares to try it on
their computer.
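
For comparison, here is a minimal sketch of the same per-year calculation
written without boot(), drawing one resample at a time so that only a single
resample of Result values is held in memory per replicate. It assumes that
only the Year and Result columns matter for the statistic. The helper name
boot_q95 is hypothetical, and the data frame is a synthetic stand-in with
made-up values, not the real inputdata.csv. Whether the storage of resampling
indices inside boot() is what actually exhausts memory in the session above
is not confirmed here.

```r
# Sketch only: bootstrap the 95th percentile of a numeric vector one
# resample at a time. boot_q95() is a hypothetical helper, not part of
# the boot package.
boot_q95 <- function(x, R = 10000, prob = 0.95) {
  n <- length(x)
  # Each replicate draws n indices with replacement and keeps only the
  # resulting quantile, so no R-by-n array is ever built.
  vapply(seq_len(R), function(r) {
    quantile(x[sample.int(n, n, replace = TRUE)], probs = prob, names = FALSE)
  }, numeric(1))
}

# Synthetic stand-in for inputdata.csv (made-up values, same column names).
set.seed(1)
data.in <- data.frame(
  Year   = rep(1950:1952, each = 200),
  Result = round(rnorm(600, mean = 2, sd = 5))
)

# One vector of bootstrap replicates per year (R kept small for the demo).
reps <- lapply(split(data.in$Result, data.in$Year), boot_q95, R = 500)

# Percentile-style 95% interval for each year's 95th percentile.
print(t(sapply(reps, quantile, probs = c(0.025, 0.975))))
```

Passing only the Result vector, rather than subsetting whole data-frame rows
inside the statistic, also avoids copying the unused columns on every
replicate.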

Tom


 

