[R] by() subset by factor gives unexpected results

Mark Leeds markleeds2 at gmail.com
Sat Aug 5 15:09:20 CEST 2017


Posting the answer here for posterity. I didn't send it to R-help
initially because I wasn't sure what the OP wanted, but I guessed right.
Sorry for the confusion in the thread.


GUESSING THAT WHAT YOU WANT IS BELOW
#===================================================================

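# stringsAsFactors = FALSE keeps B as a character vector, so col= receives
# colour names rather than factor codes.
i <- data.frame(x=c(1,2,3), y=c(0,0,0), B=c("red","blue","blue"),
                stringsAsFactors = FALSE)
j <- data.frame(x=c(1,2,3), y=c(1,1,1), B=c('red','blue','green'),
                stringsAsFactors = FALSE)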

plot(0, 0, type="n", xlim=c(0,4), ylim=c(0,1))
points(i$x, i$y, col = i$B)
points(j$x, j$y, col = j$B)
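
For anyone wondering why the factor version misbehaves: as far as I can
tell, the plotting functions use a factor's integer codes for col=, and
those codes index into palette() instead of being read as colour names.
The sketch below only inspects the codes. The names B_i and B_j are mine,
and factor() is called explicitly so the result does not depend on the
stringsAsFactors default of the R version in use.

```r
# Same colour labels as in i$B and j$B, stored as factors.
B_i <- factor(c("red", "blue", "blue"))
B_j <- factor(c("red", "blue", "green"))

# Levels are sorted alphabetically, so the same label can get a
# different integer code in each factor.
levels(B_i)      # "blue" "red"
as.integer(B_i)  # 2 1 1
levels(B_j)      # "blue" "green" "red"
as.integer(B_j)  # 3 1 2

# The colours actually drawn are looked up by code in the current palette.
palette()[as.integer(B_i)]
palette()[as.integer(B_j)]
```

"red" is code 2 in the first factor and code 3 in the second, so the two
"red" points are looked up at different palette positions. What colours
those positions hold depends on the palette in use.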

On Sat, Aug 5, 2017 at 5:59 AM, Myles English <mylesenglish at gmail.com>
wrote:

>
> The answer was (thanks to Mark Leeds) to do with the use of a factor
> instead of a character vector.
>
> on [2017-08-05] at 08:57 Myles English writes:
>
> > I am having trouble understanding how the 'by' function works.  Using
> > this bit of code:
> >
> > i <- data.frame(x=c(1,2,3), y=c(0,0,0), B=c("red","blue","blue"))
> > j <- data.frame(x=c(1,2,3), y=c(1,1,1), B=c('red','blue','green'))
>
> The use of I() prevents conversion to a factor:
>
> i <- data.frame(x=c(1,2,3), y=c(0,0,0), B=I(c("red","blue","blue")))
> j <- data.frame(x=c(1,2,3), y=c(1,1,1), B=I(c('red','blue','green')))
>
> > plot(0, 0, type="n", xlim=c(0,4), ylim=c(0,1))
> > by(i, i$B, function(s){ points(s$x, s$y, col=s$B) })
> > by(j, j$B, function(s){ points(s$x, s$y, col=s$B) })
> >
> > I would have expected the point at (1,1) to be coloured red.  When
> > plotted, this row is indeed red:
> >
> >> i[1,]
> >   x y   B
> > 1 1 0 red
> >
> > however, this next point is green on the plot even though I would like
> > it to be red:
> >
> >> j[1,]
> >   x y   B
> > 1 1 1 red
> >
> > How can I achieve that?
> >
> > Myles
>
>
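
If the data frame already holds a factor column and rebuilding it is not
convenient, one possible alternative to I() or stringsAsFactors = FALSE
is to convert at the point of use with as.character(). This is only a
sketch of the by() call quoted above with that one change. The factor is
built explicitly here, which is my assumption about how the original
column was stored.

```r
# Factor column built explicitly, so this does not depend on the
# stringsAsFactors default of the running R version.
j <- data.frame(x = c(1, 2, 3), y = c(1, 1, 1),
                B = factor(c("red", "blue", "green")))

# as.character() recovers the labels, which col= can read as colour names.
as.character(j$B)  # "red" "blue" "green"

plot(0, 0, type = "n", xlim = c(0, 4), ylim = c(0, 1))
# invisible() hides the printed by() result, which is not of interest here.
invisible(by(j, j$B, function(s) {
  points(s$x, s$y, col = as.character(s$B))
}))
```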
