[R] Tricky vectorization problem

Mike Lawrence Mike.Lawrence at dal.ca
Sun Oct 7 18:20:11 CEST 2007


Had a realization this morning that I think achieves ceiling  
performance and thought I'd share with the list. I was previously  
generating sample data per participant and then calculating a mean,  
but of course this could be simplified by simply getting a single  
value from the sampling distribution of the mean for that  
participant. This speeds things up immensely (from 10.5 seconds to .5  
seconds) and should be sufficient for my needs, but I'd still welcome  
further improvement suggestions.


#start a timer
start = proc.time()[1]

#set the number of monte carlo experiments
mce = 1e3

#set the true correllation
rho = .5

#set the number of Subjects
Ns = 100

#for each subject, set a number of observations in A
a.No = 1:100
#for each subject, set a number of observations in B
b.No = 1:100

#set the between Ss variance in condition A
v.a = 1
#set the between Ss variance in condition B
v.b = 2

#for each subject, set a standard deviation in A
s.a.w = 1:100
#for each subject, set a standard deviation in B
s.b.w = 1:100

#set up a collector for the simulated correlations
sim.r = vector(length=mce)

#define a covariance matrix for use in generating the correlated data
Sigma=matrix(c(v.a,sqrt(v.a*v.b)*rho,sqrt(v.a*v.b)*rho,v.b),2,2)
eS <- eigen(Sigma, symmetric = TRUE, EISPACK = TRUE)
ev <- eS$values
fact <- eS$vectors %*% diag(sqrt(pmax(ev, 0)), 2)

#set up a collector for subject means in A
a = vector(length=Ns)
#set up a collector for subject means in B
b = vector(length=Ns)


#generate correlated ideal means for for each subject in each condition
X <- rnorm(2 * Ns * mce)
dim(X) <- c(2, Ns * mce)
sub.means <- t(fact %*% X)

#generate observed means for each Subject in A
a=sub.means[,1]+rnorm(Ns*mce,0,s.a.w/sqrt(a.No))

#generate observed means for each Subject in B
b=sub.means[,2]+rnorm(Ns*mce,0,s.b.w/sqrt(b.No))

#Get the observed correlation for each monte carlo experiment
for(i in 1:mce){
	sim.r[i] = cor(
		a[(i*Ns-Ns+1):(i*Ns)]
		,b[(i*Ns-Ns+1):(i*Ns)]
	)
}

#show the total time this took
print(proc.time()[1]-start)



More information about the R-help mailing list