[R] Tricky vectorization problem

Mike Lawrence Mike.Lawrence at DAL.CA
Sun Oct 7 01:53:04 CEST 2007


Hi all,

I'm using the code below within a loop that I run thousands of times  
and even with the super-computing resources at my disposal this is  
just too slow. The snippet below takes about 10s on my machines,  
which is an order of magnitude or two slower than would be  
preferable; in the end I'd like to set the number of monte carlo  
experiments to 1e4 or even 1e5 to ensure stable results. I've tried  
my hand at vectorizing it myself but I'm finding it tricky. Any help  
would be greatly appreciated!

This code attempts to find the distribution of observed correlation  
values between A & B when each are measured imperfectly:

start = proc.time()[1] #start a timer

rho = .5 #set the true correllation

Ns = 100 #set the number of Subjects

a.No = 1:100 #for each subject, set a number of observations in A  
(1:100 chosen arbitrarily for demonstration)
b.No = 1:100 #for each subject, set a number of observations in B

v.a = 1 #set the between Ss variance in condition A
v.b = 2 #set the between Ss variance in condition B

s.a.w = 1:100 #for each subject, set a standard deviation in A (1:100  
chosen arbitrarily for demonstration)
s.b.w = 1:100 #for each subject, set a standard deviation in B

mce = 1e3 #set the number of monte carlo experiments

sim.r = vector(length=mce) #set up a collector for the simulated  
correlations

Sigma=matrix(c(v.a,sqrt(v.a*v.b)*rho,sqrt(v.a*v.b)*rho,v.b),2,2)  
#define a covariance matrix for use in generating the correlated data

a = vector(length=Ns) #set up a collector for subject means in A
b = vector(length=Ns) #set up a collector for subject means in B

for(i in 1:mce){ #start a monte carlo experiment
	
	sub.means=mvrnorm(Ns,rep(0,2),Sigma) #generate correlated ideal  
means for for each subject in each condition
	
	for(s in 1:Ns){ #for each subject
		
		a[s] = mean(rnorm(a.No[s],sub.means[s,1],s.a.w[s])) #generate some  
data for A and grab the mean
		b[s] = mean(rnorm(b.No[s],sub.means[s,2],s.b.w[s])) #generate some  
data for B and grab the mean
	
	}
	
	sim.r[i] = cor(a,b) #store the observed correlation between A and B

}

print(proc.time()[1]-start) #show the total time this took



--
Mike Lawrence
Graduate Student, Department of Psychology, Dalhousie University

Website: http://memetic.ca

Public calendar: http://icalx.com/public/informavore/Public

"The road to wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less."
	- Piet Hein



More information about the R-help mailing list