[R] Data Simulation in R

Dimitris Rizopoulos dimitris.rizopoulos at med.kuleuven.ac.be
Wed Jan 19 13:56:37 CET 2005


Hi Harold,

maybe you could try something like this:

int1 <- slo1 <- int2 <- slo2 <- numeric(N)
for(i in seq(N)){
    Data <- as.data.frame(cbind(1:10, mvrnorm(n=sample.size, mu, 
Sigma)))
    Data$V6 <- Data$V2
    Data$V7 <- Data$V3 + vl.error[i,1]
    Data$V8 <- Data$V4 + vl.error[i,2]
    Data$V9 <- Data$V5 + vl.error[i,3]
    long <- reshape(Data, idvar="Data$V1", 
varying=list(c(names(Data)[2:5]), c(names(Data)[6:9])),
                    v.names=c("score.1", "score.2"), direction="long")
    rm(Data); gc()
    ##
    fm1 <- gls(score.1~time-1, data=long, 
correlation=corAR1(form=~1|V1), method='ML')
    int1[i] <- fm1$coefficient[1]; slo1[i] <- fm1$coefficient[2]
    rm(fm1); gc()
    ##
    fm2 <- gls(score.2~time-1, data=long, 
correlation=corAR1(form=~1|V1), method='ML')
    int2[i] <- fm2$coefficient[1]; slo2[i] <- fm2$coefficient[2]
    rm(fm2, long); gc()
}

I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/16/336899
Fax: +32/16/337015
Web: http://www.med.kuleuven.ac.be/biostat
     http://www.student.kuleuven.ac.be/~m0390867/dimitris.htm


----- Original Message ----- 
From: "Doran, Harold" <HDoran at air.org>
To: "Uwe Ligges" <ligges at statistik.uni-dortmund.de>
Cc: <r-help at stat.math.ethz.ch>
Sent: Wednesday, January 19, 2005 1:36 PM
Subject: RE: [R] Data Simulation in R


> Thanks. But, I think I am doing that. I use rm() and gc() as the 
> code
> moves along. The datasets are stored as a list. Is there a way that 
> I
> can save the list object and call each dataset within a list one at 
> a
> time, or must the entire list be in memory at once?
>
> Harold
>
> -----Original Message-----
> From: Uwe Ligges [mailto:ligges at statistik.uni-dortmund.de]
> Sent: Wednesday, January 19, 2005 5:51 AM
> To: Doran, Harold
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] Data Simulation in R
>
> Doran, Harold wrote:
>
>> Dear List:
>>
>> A few weeks ago I posted some questions regarding data simulation 
>> and
>> received some very helpful comments, thank you. I have modified my
>> code accordingly and have made some progress.
>>
>> However, I now am facing a new challenge along similar lines. I am
>> attempting to simulate 250 datasets and then run the data through a
>> linear model. I use rm() and gc() as I move along to clean up the
>> workspace and preserve memory. However, my aim is to use sample 
>> sizes
>> of 5,000 and 10,000. By any measure this is a huge task.
>>
>> My machine has 2GB RAM and a Pentium 4 2.8 GHz machine with Windows
> XP.
>> I have the following in the "target" section of the Windows 
>> shortcut
>> --max-mem-size=1812M
>>
>> With such large samples, R is unable to perform the analysis, at 
>> least
>
>> with the code I have developed. It halts when it runs out of 
>> memory. A
>
>> collegue subsequently constructed the simulation using another
>> software program with a similar computer and, while it took over 
>> night
>
>> (and then some), the program produced the results desired.
>>
>> I am curious if it is the case that such large simulations are out 
>> of
>> the grasp of R or if my code is not adequately organized to perform
>> the simulation.
>>
>> I would appreciate any thoughts or advice.
>
>
> Don't hold all datasets (and results, if they are big) in the memory 
> at
> the same time!!!
>
> Either generate them when you use them and delete them afterwards, 
> or
> save them to disc an only load one by one for further analyses.
> Also, you might want to call gc() after you removed large objects...
>
> Uwe Ligges
>
>
>
>> Harold
>>
>>
>>
>> library(MASS)
>> library(nlme)
>> mu<-c(100,150,200,250)
>> Sigma<-matrix(c(400,80,80,80,80,400,80,80,80,80,400,80,80,80,80,400),4
>> ,4
>> )
>> mu2<-c(0,0,0)
>> Sigma2<-diag(64,3)
>> sample.size<-5000
>> N<-250 #Number of datasets
>> #Take a single draw from VL distribution vl.error<-mvrnorm(n=N, 
>> mu2,
>> Sigma2)
>>
>> #Step 1 Create Data
>> Data <- lapply(seq(N), function(x)
>> as.data.frame(cbind(1:10,mvrnorm(n=sample.size, mu, Sigma))))
>>
>> #Step 2 Add Vertical Linking Error
>> for(i in seq(along=Data)){
>> Data[[i]]$V6 <- Data[[i]]$V2
>> Data[[i]]$V7 <- Data[[i]]$V3 + vl.error[i,1]
>> Data[[i]]$V8 <- Data[[i]]$V4 + vl.error[i,2]
>> Data[[i]]$V9 <- Data[[i]]$V5 + vl.error[i,3] }
>>
>> #Step 3 Restructure for Longitudinal Analysis long <- lapply(Data,
>> function(x) reshape(x, idvar="Data[[i]]$V1",
>> varying=list(c(names(Data[[i]])[2:5]),c(names(Data[[i]])[6:9])),
>> v.names=c("score.1","score.2"), direction="long"))
>>
>> #####################
>> #Clean up Workspace
>> rm(Data,vl.error)
>> gc()
>> #####################
>>
>> # Step 4 Run GLS
>>
>> glsrun1 <- lapply(long, function(x) gls(score.1~I(time-1), data=x,
>> correlation=corAR1(form=~1|V1), method='ML'))
>>
>> # Extract intercepts and slopes
>> int1 <- sapply(glsrun1, function(x) x$coefficient[1])
>> slo1 <- sapply(glsrun1, function(x) x$coefficient[2])
>>
>> ################
>> #Clean up workspace
>> rm(glsrun1)
>> gc()
>>
>> glsrun2 <- lapply(long, function(x) gls(score.2~I(time-1), data=x,
>> correlation=corAR1(form=~1|V1), method='ML'))
>>
>> # Extract intercepts and slopes
>> int2 <- sapply(glsrun2, function(x) x$coefficient[1])
>> slo2 <- sapply(glsrun2, function(x) x$coefficient[2])
>>
>>
>> #Clean up workspace
>> rm(glsrun2)
>> gc()
>>
>>
>>
>> # Print Results
>>
>> cat("Original Standard Errors","\n", "Intercept","\t",
>> sd(int1),"\n","Slope","\t","\t", sd(slo1),"\n")
>>
>> cat("Modified Standard Errors","\n", "Intercept","\t",
>> sd(int2),"\n","Slope","\t","\t", sd(slo2),"\n")
>>
>> rm(list=ls())
>> gc()
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide!
>> http://www.R-project.org/posting-guide.html
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
>




More information about the R-help mailing list