[R] Help with speed (replacing the loop?)

Dimitri Liakhovitski dimitri.liakhovitski at gmail.com
Wed Jan 11 15:57:39 CET 2012


Dear R-ers,

I have a loop (below) that goes through the numeric variables in data
frame x and through the levels of the factor "group". For each group, it
multiplies the values of the numeric variables in x by the corresponding
group-specific values from data frame y. In reality:
dim(x) is 300,000 rows by 100 variables, and
dim(y) is 120 levels of "group" by 100 variables.
So my data frame x takes up a lot of memory. This is why I overwrite
the values of "a" and "b" in x with the newly calculated values, rather
than adding them as new columns.
The code does what I need, but it is very slow.

Is there a faster way to achieve this?
Thanks a lot!
Dimitri


# Example data:
x <- data.frame(group = c(rep("group1", 5), rep("group2", 5)),
                a = 1:10, b = seq(10, 100, by = 10))
x$group <- as.factor(x$group)
y <- data.frame(group = c("group1", "group2"), a = c(10, 20), b = c(2, 3))
y$group <- as.factor(y$group)
print(x); print(y)

# My code:
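myvars <- c("a", "b")
for (var in myvars) {
  for (group in levels(y$group)) {
    # rows of x that belong to the current group
    rows <- x$group %in% group
    # multiply them by this group's value of the current variable in y
    x[rows, var] <- x[rows, var] * y[y$group %in% group, var]
  }
}
print(x)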
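
# A possible direction (only a sketch, not timed on the full data):
# Drop the inner loop over groups by looking up each row's group in y
# once with match(), then multiplying one whole column at a time. This
# assumes every group in x also appears in y; for a group missing from y
# it would produce NA, whereas the loop above leaves those rows as they
# are. The name "idx" is introduced only for this illustration.

```r
# Example data, same as above
x <- data.frame(group = factor(c(rep("group1", 5), rep("group2", 5))),
                a = 1:10, b = seq(10, 100, by = 10))
y <- data.frame(group = factor(c("group1", "group2")),
                a = c(10, 20), b = c(2, 3))
myvars <- c("a", "b")

# For each row of x, the row number in y that holds its group's multipliers
idx <- match(x$group, y$group)

# One vectorized multiplication per variable, overwriting the column in x
for (var in myvars) {
  x[[var]] <- x[[var]] * y[[var]][idx]
}
print(x)
```

# The loop over variables stays, so only one column is worked on at a
# time, and the results still overwrite "a" and "b" in x.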
-- 
Dimitri Liakhovitski


