[R] Nested foreach loops in R repeating items

arun smartpink111 at yahoo.com
Thu Feb 6 00:26:40 CET 2014


Hi,
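See ?duplicated. cbind() recycles the shorter columns to the length of the
longest one, so you can blank out the repeated entries in each column:
 apply(x, 2, function(col) { col[duplicated(col)] <- ""; col })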
A.K.
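
For readers without foreach installed, the following is a minimal base-R sketch of the suggestion above. It assumes the repetition comes from cbind() recycling shorter vectors, and it rebuilds a matrix like "x" with lapply() and do.call() instead of the original foreach loop; the names cols, pairs and fixed are made up for this illustration.

```r
# Column names matching the example data (col1 ... col10).
cols <- paste0("col", 1:10)

# Same strings the nested foreach loop produces: one vector per m,
# with lengths 9, 8, 7, 6 and 5.
pairs <- lapply(1:5, function(m) paste(cols[m], cols[(m + 1):10]))

# cbind() recycles the shorter vectors to 9 rows (and warns about it),
# which reproduces the repeated entries seen in "x".
x <- suppressWarnings(do.call(cbind, pairs))

# Blank out every entry that already appeared earlier in its column.
fixed <- apply(x, 2, function(col) {
  col[duplicated(col)] <- ""
  col
})

print(fixed)
```

Note that this relies on the genuine entries within a column being unique, which holds here because each column pairs one fixed name with distinct partners.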



Hi all, 

I have a dataset of around a thousand columns and a few thousand
rows. I'm trying to get all the possible combinations (without
repetition) of the data columns and process them in parallel. Here's a
simplification of what my data and my code look like:

mydata <- structure(list(col1 = c(231L, 8946L, 534L), col2 = c(123L, 2361L, 
65L), col3 = c(5645L, 45L, 51L), col4 = c(654L, 356L, 32L), col5 = c(21L, 
1L, 51L), col6 = c(4L, 4515L, 15L), col7 = c(6L, 1L, 535L), col8 = c(894L, 
20L, 35L), col9 = c(68L, 21L, 123L), col10 = c(46L, 2L, 2L)), .Names = c("col1", 
"col2", "col3", "col4", "col5", "col6", "col7", "col8", "col9", 
"col10"), class = "data.frame", row.names = c(NA, -3L)) 

require(foreach)

x <-
  foreach(m = 1:5, .combine = 'cbind') %:%
    foreach(j = (m + 1):10, .combine = 'c') %do% {
      paste(colnames(mydata)[m], colnames(mydata)[j])
    }

x



If you execute the commands above in R, you will get this result:



      result.1     result.2     result.3     result.4     result.5     
 [1,] "col1 col2"  "col2 col3"  "col3 col4"  "col4 col5"  "col5 col6" 
 [2,] "col1 col3"  "col2 col4"  "col3 col5"  "col4 col6"  "col5 col7" 
 [3,] "col1 col4"  "col2 col5"  "col3 col6"  "col4 col7"  "col5 col8" 
 [4,] "col1 col5"  "col2 col6"  "col3 col7"  "col4 col8"  "col5 col9" 
 [5,] "col1 col6"  "col2 col7"  "col3 col8"  "col4 col9"  "col5 col10" 
 [6,] "col1 col7"  "col2 col8"  "col3 col9"  "col4 col10" "col5 col6" 
 [7,] "col1 col8"  "col2 col9"  "col3 col10" "col4 col5"  "col5 col7" 
 [8,] "col1 col9"  "col2 col10" "col3 col4"  "col4 col6"  "col5 col8" 
 [9,] "col1 col10" "col2 col3"  "col3 col5"  "col4 col7"  "col5 col9" 

The problem is that the last row of the second column of the "x" matrix
says "col2 col3", which is a repetition of the first item in that column
(the same thing happens in all succeeding columns). I was planning to
have only unique combinations of the columns, which obviously did not
work.

Can somebody please help me with this? My desired output would be 



      result.1     result.2     result.3     result.4     result.5     
 [1,] "col1 col2"  "col2 col3"  "col3 col4"  "col4 col5"  "col5 col6" 
 [2,] "col1 col3"  "col2 col4"  "col3 col5"  "col4 col6"  "col5 col7" 
 [3,] "col1 col4"  "col2 col5"  "col3 col6"  "col4 col7"  "col5 col8" 
 [4,] "col1 col5"  "col2 col6"  "col3 col7"  "col4 col8"  "col5 col9" 
 [5,] "col1 col6"  "col2 col7"  "col3 col8"  "col4 col9"   
 [6,] "col1 col7"  "col2 col8"  "col3 col9"   
 [7,] "col1 col8"  "col2 col9"   
 [8,] "col1 col9"  "col2 col10" 
 [9,] "col1 col10" 
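
One possible way to get this layout directly, sketched here in base R rather than with foreach, is to pad each column with empty strings before binding so that nothing gets recycled. The names cols, pairs, padded and all_pairs are hypothetical, and the last two lines assume that every unique pair of the ten columns is wanted, not just those with m from 1 to 5.

```r
cols <- paste0("col", 1:10)

# One vector of pair labels per m, as in the nested loop.
pairs <- lapply(1:5, function(m) paste(cols[m], cols[(m + 1):10]))

# Pad each vector with "" up to the longest length so that the
# columns can be bound without any recycling.
n <- max(lengths(pairs))
padded <- sapply(pairs, function(p) c(p, rep("", n - length(p))))
colnames(padded) <- paste0("result.", 1:5)
print(padded)

# If a flat list of every unique pair is enough, combn() builds it
# in one call.
all_pairs <- combn(cols, 2, FUN = paste, collapse = " ")
length(all_pairs)
```

A flat vector like all_pairs may also be easier to hand to a parallel loop than a padded matrix, since it contains no blank entries to skip.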


Many thanks


