[R] How to find out if two cells in a dataframe belong to the same pre-specified factor-level

trichter at uni-bremen.de trichter at uni-bremen.de
Sun Sep 27 23:22:13 CEST 2015


Dear list,
I really couldn't find a better way to describe my question, so please
bear with me.

To illustrate my problem: I have a matrix of ecological distances
(dist1) and one of genetic distances (dist2) for a number of biological
species. I have melted both matrices to long format (m1, m2), merged
them, and want to plot the two distances against each other, as in
this example:

library(reshape2)  # its melt() names the matrix index columns Var1/Var2, as the join below expects
library(ggplot2)
library(dplyr)

dist1 <- matrix(runif(16),4,4)
dist2 <- matrix(runif(16),4,4)
rownames(dist1) <- colnames(dist1) <- paste0("A",1:4)
rownames(dist2) <- colnames(dist2) <- paste0("A",1:4)

m1 <- melt(dist1)
m2 <- melt(dist2)

final <- full_join(m1,m2, by=c("Var1","Var2"))
ggplot(final, aes(value.x,value.y)) + geom_point()

Here is the twist:
The biological species belong to certain groups, which are given in  
the dataframe `species`, for example:

species <- data.frame(spcs = paste0("A", 1:4),
                      grps = factor(rep(c("cat", "dog"), each = 2)))

I want to check whether both members of an x,y pair in final (that is,
`final$Var1` and `final$Var2`) belong to the same group of species
(here "cat" or "dog"), and then color the points by group in the
x,y scatterplot.
Thus, I need an R translation for:

final$group <- if (final$Var1 and final$Var2) belong to the same group
               as specified in species, then assign the species group
               here, else do nothing or assign NA

so i can proceed with

ggplot(final, aes(value.x,value.y, col=group)) + geom_point()

So, in the example, the pairs A1-A1, A1-A2, A2-A1, A2-A2 should be
identified as "both cats", hence should get the factor level "cat".
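
To make the intended result concrete, here is a minimal sketch of one
way the lookup could be written in base R, using match() and ifelse().
It is only an illustration under assumptions, not necessarily the best
approach: `pairs` is a hypothetical stand-in for `final` with the
distance columns left out, and the helper vectors g1 and g2 are names
made up for this example.

```r
# Toy lookup table from the example above: species name -> group.
species <- data.frame(spcs = paste0("A", 1:4),
                      grps = c("cat", "cat", "dog", "dog"),
                      stringsAsFactors = FALSE)

# Hypothetical stand-in for `final`: every ordered pair of species
# (the two distance columns are omitted to keep the sketch short).
pairs <- expand.grid(Var1 = species$spcs, Var2 = species$spcs,
                     stringsAsFactors = FALSE)

# Look up the group of each member of the pair. as.character() guards
# against Var1/Var2 being factors, as they are after melt().
g1 <- species$grps[match(as.character(pairs$Var1), species$spcs)]
g2 <- species$grps[match(as.character(pairs$Var2), species$spcs)]

# Keep the group only where both members agree; otherwise assign NA.
pairs$group <- ifelse(g1 == g2, g1, NA_character_)
```

With this toy table, A1-A2 would come out as "cat" and A1-A3 as NA.
Applied to the real data, the same three lines would use final$Var1 and
final$Var2 in place of pairs$Var1 and pairs$Var2.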

Thank you very much!


Tim


