[R] cor/group by???

Peter Dalgaard p.dalgaard at biostat.ku.dk
Wed Feb 1 22:18:22 CET 2006


bin999 at lycos.com writes:

> Can someone help me calculate correlations for grouped values?
> 
> Heres what my first few line of data look like:
> 
> > head(cmexpr)
>   LLID  GMID    CEXPR    MEXPR
> 1 1005 10831 2.057462 -0.08486
> 2 1005 10831 2.057515 -0.08486
> 3 1005 10831 2.057462  0.01209
> 4 1005 10831 2.057515  0.01209
> 5 1005 10836 2.050980  0.17237
> 6 1005 10836 2.018576  0.17237
> 
> LLID is gene id, GMID is cell line id, the EXPR columns are gene expression is two different microarray experiments.
> 
> I'd like to get correlations for each gene id (1005, 1006, etc)
> 
> Heres what I've tried so far:
> 
> sapply(by(cmexpr[3:4],cmexpr$LLID, function(x) cor(cmexpr$CEXPR,cmexpr$MEXPR, use = "pairwise.complete.obs")), function(x) x,simplify=T)
> 
> I think this is close to what I need, but its giving me the same corr for each gene, which is not right:
> 
>        1005       10083       10146       10158       10174       10206
> -0.04751543 -0.04751543 -0.04751543 -0.04751543 -0.04751543 -0.04751543
>       10211       10212       10219       10363       10484       10492
> -0.04751543 -0.04751543 -0.04751543 -0.04751543 -0.04751543 -0.04751543
> 
> Sorry if this sounds like a newbie question; it seems like it ought to be pretty straightforward, but I've wrestled with both the R documenation and mailing list archive and can't seem to get it right.

You're passing a function to by() whose result does not depend
on the argument x ...

This might be clearer:

c(by(cmexpr, cmexpr$LLID, with, cor(CEXPR, MEXPR, use="pair")))

or

sapply(split(cmexpr, cmexpr$LLID), with, cor(CEXPR, MEXPR, use="pair"))

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907




More information about the R-help mailing list