[R] Simulation Questions

Andrew Robinson A.Robinson at ms.unimelb.edu.au
Tue May 3 01:59:43 CEST 2011


Hi Shane,

it sounds to me as though you have a fairly well-defined problem.  You
want to generate random numbers with a specific mean, variance, and
correlation with another random varaible.  I would reverse-enginerr
the fuinctions for simple linear regression to get a result like

y = beta_0 + beta_1 * x + rnorm(n, 0, sigma^2)

and use that as the basis of generating random numbers.

Not sure how to interpret the second question ...

Cheers

Andrew

On Sun, May 01, 2011 at 12:33:41AM -0400, Shane Phillips wrote:
> I have the following script for generating a dataset.  It works like a champ except for a couple of things.
> 
> 1.  I need the variables "itbs" and  "map" to be negatively correlated with the binomial variable "lunch"  (around -0.21 and -0.24, respectively). The binomial variable  "lunch" needs to remain unchanged.
> 2.  While my generated variables do come out with the desired means and correlations, the distribution is very narrow and only represents a small portion of the possible scores.  Can I force it to encompass a wider range of scores, while maintaining my desired parameters and correlations?
> 
> Please help...
> 
> Shane
> 
> Script follows...
> 
> 
> 
> #Number the subjects
> subject=1:1000
> #Assign a treatment condition from a binomial distribution with a probability of 0.13
> treat=rbinom(1*1000,1,.13)
> #Assign a lunch status condition froma binomial distribution with a probability of 0.35
> lunch=rbinom(1*1000,1,.35)
> #Generate age in months from a random normal distribution with mean of 87 and sd of 2
> age=rnorm(1000,87,2)
> #invoke the MASS package
> require(MASS)
> #Establish the covariance matrix for MAP, ITBS and CogAT scores
> sigma <- matrix(c(1, 0.84, 0.59, 0.84, 1, 0.56, 0.59, 0.56, 1), ncol = 3)
> #Establish MAP as a random normal variable with mean of 200 and sd of 9
> map   <- rnorm(1000, 200, 9)
> #Establish ITBS as a random normal variable with mean of 175 and sd of 15
> itbs <- rnorm(1000, 175, 15)
> #Establish CogAT as a random normal variable with mean of 100 and sd of 16
> cogat<-rnorm(1000,100,16)
> #Create a dataframe of MAP, ITBS, and CogAT
> data <- data.frame(map, itbs, cogat)
> #Draw from the multivariate distribution defined by MAP, ITBS, and CogAT means and the covariance matrix
> sim <- mvrnorm(1000, mu=mean(data), sigma, empirical=FALSE)
> #Set growth at 0
> growth=0
> #Combine elements into a single dataset
> simtest=data.frame (subject=subject, treat=treat,lunch, age=round(age,0),round(sim,0),growth)
> #Set mean growth by treatment condition with treatd subjects having a mean growth of 1.5 and non-treated having a mean growth of 0.1
> simtest<-transform(simtest, growth=rnorm(1000,m=ifelse(treat==0,0.1,1.5),s=1))
> simtest
> cor (simtest)
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and Statistics            Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia               (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr              Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/



More information about the R-help mailing list