[R] a pickle with ranks and reals?

Thomas W Blackwell tblackw at umich.edu
Fri Aug 22 15:03:47 CEST 2003


John  -

Here are two equivalent solutions to your final question:

data <- data.frame(x=seq(15), y=sample(seq(15), 15),
                   subj=sample(c("harry","steve","nathan","john"), 15, replace=TRUE))

result.1 <- unclass(by(data, data$subj, function(dd) cor(dd$x, dd$y)))

result.2 <- unclass(by(data, data$subj, function(dd) cor(dd[c(1,2)])[1,2]))

I guess I prefer  result.1  since the code is easier to read,
even though it does hard-code the column names.

The "function(dd)" stuff is a very common idiom in calls to  by(),
sapply()  and  lapply().  It defines a little function in-line,
without ever naming it, and passes it as the third argument to
by().  I use this all the time when I need to rearrange the order
of the arguments, or do a little bit of subscripting (as here),
before calling a function (cor()) which I would otherwise just
pass directly as the third argument to  by().
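
To make the in-line function idea concrete, here is a small sketch.
The data frame  dd.all  and the names  n.rows, r.anon, r.named  and
cor.xy  are made up for illustration.  It passes a function directly,
then in-line, then by name.

```r
# Toy data with fixed values so every subject has three rows
# (values and object names are illustrative only).
dd.all <- data.frame(x = c(4, 5, 6, 2, 3, 8),
                     y = c(7, 1, 9, 4, 7, 5),
                     subj = c("harry", "harry", "harry",
                              "steve", "steve", "steve"))

# A function that already accepts one data frame can be passed as is.
n.rows <- by(dd.all, dd.all$subj, nrow)

# cor() wants two vectors, so wrap it in an unnamed function that
# pulls the two columns out of each subject's sub-data-frame.
r.anon <- by(dd.all, dd.all$subj, function(dd) cor(dd$x, dd$y))

# The same computation with the helper given a name first.
cor.xy <- function(dd) cor(dd$x, dd$y)
r.named <- by(dd.all, dd.all$subj, cor.xy)
```

The in-line and named versions should give the same numbers; the
in-line form just saves inventing a name for a one-off helper.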

I'll let others comment on my use of  unclass()  here.  The
goal was to get a numeric vector with a names attribute, so
it can be incorporated into further processing.  I'm surprised
just how much tinkering it took to get this all to work.
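
One possible alternative that avoids  unclass()  is  split()  followed
by  sapply().  I am not claiming it is better, it is just another
route to a named numeric vector.  The first five rows of  s  below
are the ones from your message; the sixth row and the names  r.vec
and  r.tab  are assumptions added so the example runs.

```r
# Data frame s: first five rows from the question, sixth row invented
# so that steve also has three observations.
s <- data.frame(x = c(4, 5, 6, 2, 3, 8),
                y = c(7, 1, 9, 4, 7, 5),
                subj = c("harry", "harry", "harry",
                         "steve", "steve", "steve"))

# split() gives a list of per-subject data frames; sapply() collapses
# the per-subject results into a plain named numeric vector.
r.vec <- sapply(split(s, s$subj), function(dd) cor(dd$x, dd$y))

# Lay the result out as the two-column table asked for in the question.
r.tab <- data.frame(r = unname(r.vec), subj = names(r.vec))
```

Here  r.vec  can go straight into further arithmetic, and  r.tab  has
one row per subject with columns  r  and  subj.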

This might actually make a useful example to add to the help
page for  by().

-  tom blackwell  -  u michigan medical school  -  ann arbor  -

On Fri, 22 Aug 2003, John Christie wrote:

> . . .  And, I also wanted to analyze correlations subject by subject and
> compare my two groups.  However, there doesn't seem to be a good way to
> get this.  I tried using "by" with "cor".  However, this requires
> binding x and y which causes cor to return a matrix (if you could pass
> it x and y separate it would just return a number).
>
> given
>
> data frame s
> x	y	subj
> 4	7	harry
> 5	1	harry
> 6	9	harry
> 2	4	steve
> 3	7	steve
> ...
>
> i'd like to be able to produce
>
> r	subj
> .12	harry
> .52	steve
> ...
>
> any tips?



