[R] loop testing unidentified columns

Brittany Demmitt demmitba at gmail.com
Mon Jun 20 19:14:56 CEST 2016


Hello,

I want to compare all of the columns of one data frame with those of another to see whether any of the columns are equivalent. The first column in both data frames holds the sample IDs and does not need to be compared. Below is an example of the loop I am using to compare the two data frames; it counts the number of equal values between two columns. In this example, a value of 3 means that all three observations in the two columns being compared were equal. The loop works fine, but I do not understand why it appears to test the first column (the sample IDs), returning “NA” for the number of matches, when the loop only specifies columns 2-3.

Thank you!


# Create data frame A
A <- matrix(c("a", 3, 4, "b", 5, 7, "c", 3, 7), nrow = 3, ncol = 3, byrow = TRUE)
A <- as.data.frame(A)
A$V2 <- as.numeric(A$V2)
A$V3 <- as.numeric(A$V3)
str(A)

# Create data frame B
B <- matrix(c("a", 1, 1, "b", 6, 2, "c", 2, 2), nrow = 3, ncol = 3, byrow = TRUE)
B <- as.data.frame(B)
B$V2 <- as.numeric(B$V2)
B$V3 <- as.numeric(B$V3)
str(B)

results.2 <- numeric()
results.3  <- numeric()


# Compare columns to identify those that are identical in the two data frames
for (i in 2:3) {
  results.2[i] <- sum(A[, 2] == B[, i])
  results.3[i] <- sum(A[, 3] == B[, i])
}
# Combine the per-column counts into one table
results.pc.all <- rbind(results.2, results.3)
results.pc.all
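
One likely explanation for the "NA", offered as a sketch rather than a definitive answer: results.2 and results.3 start as empty numeric vectors, and the loop first assigns to position 2, so R pads the unassigned position 1 with NA. The loop itself never reads column 1. The example below shows that padding, then one possible way to avoid it by filling a preallocated matrix indexed by offset instead of by column number. The names x, cols, A2, B2 and matches are hypothetical, and the data frames are built with explicit numeric columns purely for illustration.

```r
# Assigning past the end of an empty vector pads the gap with NA.
x <- numeric()
x[2] <- 10
x            # position 1 was never assigned, so it is NA

# Hypothetical data frames with explicit numeric columns (illustration only).
A2 <- data.frame(id = c("a", "b", "c"), V2 = c(3, 5, 3), V3 = c(4, 7, 7))
B2 <- data.frame(id = c("a", "b", "c"), V2 = c(3, 6, 3), V3 = c(4, 7, 7))

cols <- 2:3  # columns to compare; column 1 holds the sample IDs

# Preallocate one row per compared column of A2 and one column per
# compared column of B2, so no position is left unassigned.
matches <- matrix(NA_integer_, nrow = length(cols), ncol = length(cols),
                  dimnames = list(names(A2)[cols], names(B2)[cols]))

for (i in seq_along(cols)) {
  for (j in seq_along(cols)) {
    # Count the rows where the two columns hold equal values.
    matches[i, j] <- sum(A2[[cols[i]]] == B2[[cols[j]]])
  }
}
matches
```

Indexing the result by seq_along(cols) rather than by the column numbers themselves keeps the result the same size as the set of columns actually compared, so the sample ID column leaves no empty slot.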


