[R] Tricky vectorization problem

Mike Lawrence Mike.Lawrence at DAL.CA
Mon Oct 8 00:32:21 CEST 2007


Posting this for posterity and to demonstrate that a speed-up of 2  
orders of magnitude is indeed possible! I can now run 1e5 monte carlo  
experiments in the time the old code could only run 1e3. The final  
change was to replace my use of cor() with .Internal(cor()), which  
avoids some checks that are unnecessary in this case:


#start a timer
start = proc.time()[1]

#set the number of monte carlo experiments
mce = 1e5

#set the true correllation
rho = 1

#set the number of Subjects
Ns = 100

#for each subject, set a number of observations in A
a.No = 1:100
#for each subject, set a number of observations in B
b.No = 1:100

#set the between Ss variance in condition A
v.a = 1
#set the between Ss variance in condition B
v.b = 2

#for each subject, set a standard deviation in A
s.a.w = 1:100
#for each subject, set a standard deviation in B
s.b.w = 1:100

#set up a collector for the simulated correlations
sim.r = vector(length=mce)

#define a covariance matrix for use in generating the correlated data
Sigma=matrix(c(v.a,sqrt(v.a*v.b)*rho,sqrt(v.a*v.b)*rho,v.b),2,2)
eS <- eigen(Sigma, symmetric = TRUE, EISPACK = TRUE)
ev <- eS$values
fact <- eS$vectors %*% diag(sqrt(pmax(ev, 0)), 2)

#set up a collector for subject means in A
a = vector(length=Ns)
#set up a collector for subject means in B
b = vector(length=Ns)


#generate correlated ideal means for for each subject in each condition
X <- rnorm(2 * Ns * mce)
dim(X) <- c(2, Ns * mce)
sub.means <- t(fact %*% X)
	
a=sub.means[,1]+rnorm(Ns*mce,0,s.a.w/sqrt(a.No))
b=sub.means[,2]+rnorm(Ns*mce,0,s.b.w/sqrt(b.No))

for(i in 1:mce){
	end=i*Ns
	sim.r[i] = .Internal(
		cor(
			a[(end-Ns+1):end]
			,b[(end-Ns+1):end]
			,TRUE
			,FALSE
		)
	)
}

#show the total time this took
print(proc.time()[1]-start)



More information about the R-help mailing list