[R] Avoiding loops using 'for' and pairwise comparison of columns

Kulupp kulupp at online.de
Mon Jun 24 11:01:41 CEST 2013


Dear R-experts,

I'd like to avoid the use of very slow 'for' loops but I don't know how.
My data look as follows (the original data have 1600 rows and 30 columns):

# data example
c1 <- c(1,1,1,0.25,0,1,1,1,0,1)
c2 <- c(0,0,1,1,0,1,0,1,0.5,1)
c3 <- c(0,1,1,1,0,0.75,1,1,0.5,0)
x <- data.frame(c1,c2,c3)

I need to compare every column with every other column and want to know,
for each column pair, the percentage of rows in which the two columns
agree. To calculate this percentage I used the function 'agree' from the
irr package. I solved the problem with nested loops that are very slow.

library(irr)     # required for the function 'agree'

# empty data frame for the results, sized to the number of columns in x
a <- as.data.frame(matrix(data=NA, nrow=ncol(x), ncol=ncol(x)))
colnames(a) <- colnames(x)
rownames(a) <- colnames(x)

# the loop to write the data: agreement of column j with column i
for (j in seq_len(ncol(x))) {
   for (i in seq_len(ncol(x))) {
     a[i,j] <- agree(cbind(x[,j], x[,i]))$value
   }
}


I would be very pleased to receive your suggestions on how to avoid the
loop. Furthermore, the resulting data frame could be displayed as a
triangular matrix, so that each pairwise comparison appears only once,
but I don't know how to do this.
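
To make the desired result concrete, here is a minimal sketch of one possible direction, in base R only. It assumes that 'agree' with its default tolerance of 0 reports 100 times the share of rows in which both columns hold identical values, after dropping rows with missing values; that assumption should be checked against the irr documentation. The names 'm', 'pairs' and 'pct' are only illustrative.

```r
# data example from above
c1 <- c(1,1,1,0.25,0,1,1,1,0,1)
c2 <- c(0,0,1,1,0,1,0,1,0.5,1)
c3 <- c(0,1,1,1,0,0.75,1,1,0.5,0)
x <- data.frame(c1, c2, c3)

m <- as.matrix(x)

# all column pairs (i, j) with i < j, one pair per column of 'pairs'
pairs <- combn(ncol(m), 2)

# assumed equivalent of agree(...)$value with tolerance 0:
# percentage of rows where the two columns are exactly equal
pct <- apply(pairs, 2, function(p) {
  100 * mean(m[, p[1]] == m[, p[2]], na.rm = TRUE)
})

# result matrix: only the upper triangle is filled, so each
# pairwise comparison appears once; the rest stays NA
a <- matrix(NA_real_, ncol(m), ncol(m),
            dimnames = list(colnames(m), colnames(m)))
a[t(pairs)] <- pct
a
```

This computes each pair once instead of twice and skips the comparison of a column with itself, which roughly halves the work; the remaining cost is one vectorised comparison per pair.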

Kind regards

Thomas


