[R] Grouping columns in a data frame based on the values of a column

John Kane jrkrideau at yahoo.ca
Sat Sep 16 18:24:20 CEST 2006


--- e.rapsomaniki at mail.cryst.bbk.ac.uk wrote:

> Dear R users,
> 
> This is a trivial question, and there might even be an R
> function for it, but I have
> to do it many times and wonder if there is an
> efficient way to do it.
> 
> 
> Suppose we have a data frame like this:
> d <- data.frame(x = sample(seq(0.1, 1, by = 0.01),
>                            size = 100, replace = TRUE),
>                 y = rnorm(100, 0.2, 0.6))
> 
> and want to have the average of y for a given
> interval of x, for example
> mean(y)[0 < x < 0.1]. Is there a simple way of doing
> this or do I need to improvise?

I don't think so.  I don't think there is any value of
x < 0.1 in the data frame.

However, if we change the data.frame to read
d <- data.frame(x = sample(seq(0.01, 1, by = 0.01),
                           size = 100, replace = TRUE),
                y = rnorm(100, 0.2, 0.6))

dd <- subset(d, x > 0 & x < 0.1)
mean(dd$y)

seems to work.
Or, if you do this a lot, you might want to write it as
a function.

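sub.mean <- function(frame, first.col, second.col, upper, lower) {
  # rows whose first.col value lies strictly between lower and upper
  keep <- frame[[first.col]] > lower & frame[[first.col]] < upper
  # average the second.col values of those rows only
  mean(frame[[second.col]][keep])
}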

sub.mean(d, 1,2,0.1,0)
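
If you want the average of y for every interval of x at once, rather than one interval at a time, something along these lines might do it. This is only a sketch: the break points (steps of 0.1), the seed, and the names breaks, bin and bin.means are my own choices for illustration. Note that cut() by default builds intervals that are open on the left and closed on the right, so (0,0.1] includes 0.1, which is slightly different from the strict x < 0.1 used above.

```r
# Sketch only: bin x into intervals with cut(), then average y per bin.
# The break points and the seed are assumptions chosen for illustration.
set.seed(1)
d <- data.frame(x = sample(seq(0.01, 1, by = 0.01),
                           size = 100, replace = TRUE),
                y = rnorm(100, 0.2, 0.6))

breaks <- seq(0, 1, by = 0.1)          # intervals (0,0.1], (0.1,0.2], ..., (0.9,1]
d$bin  <- cut(d$x, breaks = breaks)    # factor giving each row's interval

# mean of y within each interval (NA for an interval with no rows)
bin.means <- tapply(d$y, d$bin, mean)
bin.means
```

Because the break points are floating point numbers, a value of x that sits exactly on a break may land on either side of it, so check the boundaries if they matter to you.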


