[R] getting the results of tapply into a single matrix

arun smartpink111 at yahoo.com
Fri Nov 15 01:21:25 CET 2013


Hi,
Try:
example <- read.table(text="ID Sex Location CL
1   F    lake1   40
1   F    lake1    
1   F    lake1     43
2   M    lake1    30
3   M    lake2    22
4   F    lake2     25
4   F    lake2     27",sep="",header=TRUE,stringsAsFactors=FALSE,fill=TRUE) 


aggregate(CL~.,example,mean,na.rm=TRUE)

#or
library(plyr)
ddply(example,.(ID,Sex,Location),summarize,CL=mean(CL,na.rm=TRUE))


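#or from `result` (the 3-d array returned by tapply() in the question below)
res <- setNames(
  cbind(expand.grid(dimnames(result), stringsAsFactors=FALSE),
        as.data.frame(as.matrix(result))),
  c("Location","Sex","ID","CL"))
#drop the empty Location/Sex/ID combinations and restore the original column order
res[!is.na(res$CL), c(3,2,1,4)]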

###It would be better to store the results in a data.frame than in a matrix for this case.
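
A further base-R possibility, offered only as a sketch: the array returned by tapply() can be treated as a table and unrolled with as.data.frame(), which produces one row per Location/Sex/ID cell. The name `res2` is made up for this illustration, and the data are re-entered from the question so the snippet runs on its own.

```r
# Rebuild the example data from the question; the blank CL is read as NA (fill=TRUE)
example <- read.table(text="ID Sex Location CL
1 F lake1 40
1 F lake1
1 F lake1 43
2 M lake1 30
3 M lake2 22
4 F lake2 25
4 F lake2 27", header=TRUE, stringsAsFactors=FALSE, fill=TRUE)

# Same call as in the question: a Location x Sex x ID array of means
result <- with(example, tapply(CL, list(Location, Sex, ID), mean, na.rm=TRUE))

# Unroll the array: one row per cell, with the cell value in column CL
res2 <- as.data.frame(as.table(result), responseName="CL", stringsAsFactors=FALSE)
names(res2) <- c("Location","Sex","ID","CL")   # dimnames were unnamed, so name by position

# Keep only the combinations that occur in the data, in the original column order
res2 <- res2[!is.na(res2$CL), c("ID","Sex","Location","CL")]
res2$ID <- as.integer(res2$ID)                 # ID comes back as character from the dimnames
rownames(res2) <- NULL
res2
```

This keeps the tapply() step from the question, but the aggregate() call above reaches the same table in one line and is probably the simpler choice.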

A.K.



I have a table with three categorical columns (ID, Sex, Location) and a 
measurement column (CL). These are measurements from individuals in a 
study. Some individuals were measured more than once and thus have 
multiple rows. For those individuals, I need to take an average of all 
of their measurements so that I can run statistical tests without 
pseudoreplication. Below is an example table with the code that I am 
using, the results that I am getting, and the results that I want (I am 
calling the table "example"). 
  
ID Sex Location CL 
1   F    lake1     40 
1   F    lake1     
1   F    lake1     43 
2   M    lake1    30 
3   M    lake2    22 
4   F    lake2     25 
4   F    lake2     27 

> result <- with(example, tapply(CL, list(Location, Sex, ID), mean, na.rm=T)) 

This almost does what I want. It takes the mean for each ID and 
retains its relationship to the categorical variables. The problem is 
that the output looks like this: 

, , 1 

             F M 
lake1 41.66667   
lake2           

, , 2 

      F        M 
lake1   30.00000 
lake2           

, , 3 

      F        M 
lake1           
lake2   22.00000 

, , 4 

             F M 
lake1           
lake2 26.00000   

How do I get those results back into a single table like the one I 
originally had? In other words, I want a table that looks like this:   

ID Sex Location CL 
1   F    lake1     41.66667 
2   M    lake1    30 
3   M    lake2    22 
4   F    lake2     26.00000 
  

Thanks for the help!


