[R] Re: Using 'by()' in a function

Setzer.Woodrow@epamail.epa.gov Setzer.Woodrow at epamail.epa.gov
Fri Apr 28 19:09:50 CEST 2000



Sorry, I should have said: I'm using R version 1.0.1 on Windows 98 (i.e.,
rw1001).

R. Woodrow Setzer, Jr.
Biostatistics and Research Support Staff
NHEERL MD-55; US EPA; RTP, NC 27711
Phone: (919) 541-0128
Fax:   (919) 541-4002


From:    Woodrow Setzer
Date:    04/28/2000 01:06 PM
To:      r-help list
Subject: Using 'by()' in a function



I have a list of dataframes, and want to apply a function to subsets of the rows
of each dataframe.  It seemed natural to write a function that takes a dataframe
as an argument and uses 'by()' within it to apply the function to the dataframe
subsets.  However, I cannot get it to work.  The problem seems to lie in passing
a function argument as the data argument of by().  Is this a bug, or am I
missing something (or both)?

> ### Generate some test data
> Test <- vector("list",2)
> Test[[1]] <-
+ data.frame(Dose=rep(c(0,1),c(10,10)),Resp1=rnorm(20),Resp2=rnorm(20))
> ### The summary function
> sumfun <- function(z)
+ {
+   by(data=z,
+      INDICES=list(factor(z[,"Dose"])),
+      FUN=function(y)
+      {
+        apply(as.matrix(y[,c("Resp1","Resp2")]),2,
+              function(x)c(Mean=mean(x),SD=sqrt(var(x))))
+      }
+      )
+ }
> ### Using by works by itself
> by(data=Test[[1]],
+    INDICES=list(factor(Test[[1]][,"Dose"])),
+    FUN=function(y)
+    {
+      apply(as.matrix(y[,c("Resp1","Resp2")]),2,
+            function(x)c(Mean=mean(x),SD=sqrt(var(x))))
+    }
+    )
: 0
          Resp1      Resp2
Mean -0.2426571 -0.1024979
SD    0.9203455  0.9988352
------------------------------------------------------------
: 1
          Resp1     Resp2
Mean -0.1632326 0.1079938
SD    1.4124645 0.8793081
> ### But not in a function
> sumfun(Test[[1]])
[1] "data.frame"
Error in nrow(z) : Object "z" not found
>
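
For comparison, here is a minimal sketch of the same per-dose summary written
with split() and lapply() instead of by(), so that the dataframe is only ever
referred to inside the function's own frame.  The name sumfun2 and the seeded
test data are hypothetical, and this has not been checked on R 1.0.1; it
illustrates a possible way around the error, and does not diagnose it.

```r
## Hypothetical test data with the same layout as Test[[1]] above
set.seed(1)
dat <- data.frame(Dose = rep(c(0, 1), c(10, 10)),
                  Resp1 = rnorm(20), Resp2 = rnorm(20))

## sumfun2 (hypothetical name): split the response columns by dose level,
## then compute the mean and SD of each column within each piece
sumfun2 <- function(z)
{
  pieces <- split(z[, c("Resp1", "Resp2")], factor(z[, "Dose"]))
  lapply(pieces, function(y)
  {
    apply(as.matrix(y), 2,
          function(x) c(Mean = mean(x), SD = sqrt(var(x))))
  })
}

res <- sumfun2(dat)
```

The result is a list with one 2 x 2 matrix per dose level (rows Mean and SD,
columns Resp1 and Resp2), so it could be applied over the list of dataframes
with lapply(Test, sumfun2) once every element of Test has been filled in.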






