[R] How to find out if two cells in a dataframe belong to the same pre-specified factor-level

Adams, Jean jvadams at usgs.gov
Mon Sep 28 20:15:09 CEST 2015


Here's one approach that works.  I made some changes to the code you
provided; a full working example is given below.

library(reshape)
library(ggplot2)
library(dplyr)

dist1 <- matrix(runif(16), 4, 4)
dist2 <- matrix(runif(16), 4, 4)
rownames(dist1) <- colnames(dist1) <- paste0("A", 1:4)
rownames(dist2) <- colnames(dist2) <- paste0("A", 1:4)
m1 <- melt(dist1)
m2 <- melt(dist2)
# I changed the by= argument here: melt() from the reshape package names
# the two index columns X1 and X2, not Var1 and Var2
final <- full_join(m1, m2, by=c("X1", "X2"))

# I made some changes to keep spcs as character and grps as a factor
species <- data.frame(spcs=paste0("A", 1:4),
  grps=factor(rep(c("cat", "dog"), each=2)), stringsAsFactors=FALSE)

# define new variables for final indicating group membership
final$g1 <- species$grps[match(final$X1, species$spcs)]
final$g2 <- species$grps[match(final$X2, species$spcs)]
# pairs in the same group get that group's name, all others get "dif"
final$group <- as.factor(with(final,
  ifelse(g1==g2, as.character(g1), "dif")))

# plot just the rows with matching groups
ggplot(final[final$group!="dif", ], aes(value.x, value.y, col=group)) +
  geom_point()
# plot all the rows
ggplot(final, aes(value.x, value.y, col=group)) + geom_point()
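
The original question also mentioned assigning NA instead of a label when the two species are in different groups. Below is a minimal base R sketch of that variant, run on a toy table of pairs rather than on final. The data frame name pairs is hypothetical and only stands in for the X1/X2 columns of final; the lookup itself is the same match() idea used above.

```r
# Toy species table, same layout as `species` in the example above
species <- data.frame(spcs = paste0("A", 1:4),
                      grps = factor(c("cat", "cat", "dog", "dog")),
                      stringsAsFactors = FALSE)

# All ordered pairs of species; `pairs` is a hypothetical stand-in for the
# X1/X2 columns of `final`
pairs <- expand.grid(X1 = species$spcs, X2 = species$spcs,
                     stringsAsFactors = FALSE)

# match() gives the row of `species` for each name; indexing grps by that
# row gives the group of each member of the pair
g1 <- as.character(species$grps[match(pairs$X1, species$spcs)])
g2 <- as.character(species$grps[match(pairs$X2, species$spcs)])

# Keep the group name when both members agree, otherwise assign NA
pairs$group <- factor(ifelse(g1 == g2, g1, NA_character_))

# Count the pairs per group, including the NA (mixed-group) pairs
table(pairs$group, useNA = "ifany")
```

With NA in place of "dif", the matching rows can be selected with pairs[!is.na(pairs$group), ] instead of comparing against a label.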

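On the by= argument of the join: the key names depend on which package supplied melt(), so one way to avoid hard-coding them is to read them from the melted data. The sketch below is an assumption-laden alternative, not the code from the example: it uses base R only, with as.data.frame(as.table()) standing in for melt() and merge() standing in for full_join(). The name key_cols is made up for illustration.

```r
set.seed(1)  # only to make the toy matrices repeatable
dist1 <- matrix(runif(16), 4, 4)
dist2 <- matrix(runif(16), 4, 4)
rownames(dist1) <- colnames(dist1) <- paste0("A", 1:4)
rownames(dist2) <- colnames(dist2) <- paste0("A", 1:4)

# Base R stand-in for melt(): one row per matrix cell, values in `value`
m1 <- as.data.frame(as.table(dist1), responseName = "value")
m2 <- as.data.frame(as.table(dist2), responseName = "value")

# Take the index column names from the data instead of hard-coding them
key_cols <- setdiff(names(m1), "value")

# merge() appends .x and .y to the two clashing `value` columns;
# all = TRUE keeps unmatched rows from both sides, like a full join
final <- merge(m1, m2, by = key_cols, all = TRUE)
names(final)
```

The same setdiff() line should also work on the output of melt(), since both reshape and reshape2 call the value column value.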
Jean


On Sun, Sep 27, 2015 at 4:22 PM, <trichter at uni-bremen.de> wrote:

> Dear list,
> I really couldn't find a better way to describe my question, so please bear
> with me.
>
> To illustrate my problem, I have a matrix with ecological distances (m1)
> and one with genetic distances (m2) for a number of biological species. I
> have merged both matrices and want to plot both distances versus each
> other, as illustrated in this example:
>
> library(reshape)
> library(ggplot2)
> library(dplyr)
>
> dist1 <- matrix(runif(16),4,4)
> dist2 <- matrix(runif(16),4,4)
> rownames(dist1) <- colnames(dist1) <- paste0("A",1:4)
> rownames(dist2) <- colnames(dist2) <- paste0("A",1:4)
>
> m1 <- melt(dist1)
> m2 <- melt(dist2)
>
> final <- full_join(m1,m2, by=c("Var1","Var2"))
> ggplot(final, aes(value.x,value.y)) + geom_point()
>
> Here is the twist:
> The biological species belong to certain groups, which are given in the
> dataframe `species`, for example:
>
> species <- data.frame(spcs=as.character(paste0("A",1:4)),
>                       grps=as.factor(c(rep("cat",2),(rep("dog",2)))))
>
> I want to check if an x,y pair in final (as in `final$Var1`, `final$Var2`)
> belongs to the same group of species (here "cat" or "dog"), and then want
> to color each group separately in the x,y-scatterplot.
> Thus, I need an R translation for:
>
> final$group <- If (final$Var1 and final$Var2) belong to the same group as
>       specified in species, then assign the species group here, else do
>       nothing or assign NA
>
> so I can proceed with
>
> ggplot(final, aes(value.x,value.y, col=group)) + geom_point()
>
> So, in the example, the pairs A1-A1, A1-A2, A2-A1, A2-A2 should be
> identified as "both cats", hence should get the factor "cat".
>
> Thank you very much!
>
>
> Tim