[R] how to handle NA values in aggregate()

arun smartpink111 at yahoo.com
Sun Dec 16 07:22:42 CET 2012


Hi,

This should also work:
df1<-read.table(text="
FID  MID    IID        EW_INCU EW_17.5  EMW        EEratio
1  4621  TWF2H5    45.26        NA            15.61        NA
1  4621  TWF2H6    48.02        44.09        13.41      0.3041506
2  4630  TWF2H19  51.44      47.81        NA            NA
2  4631  TWF2H21  NA          52.72        16.70      0.3167678
2  4632  TWF2H22  55.70      50.45        16.48      0.3266601
2  4633  TWF2H23  44.42      40.89        12.96      0.3169479
",sep="",header=TRUE,stringsAsFactors=FALSE)

aggregate(df1[,4:7],by=list(df1[,1]),mean,na.rm=TRUE)
#  Group.1 EW_INCU EW_17.5  EMW EEratio
#1       1    46.6    44.1 14.5   0.304
#2       2    50.5    48.0 15.4   0.320
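
A note on why the call above works: aggregate() forwards any extra arguments (its "...") to FUN, so na.rm=TRUE ends up in every mean() call. A minimal sketch of the equivalent anonymous-function form follows. It rebuilds the numeric columns of df1 by hand so that it runs on its own, and the names vars, res_dots and res_anon are only illustrative.

```r
# Same values as df1 above (FID plus the four measured columns),
# rebuilt here so this snippet is self-contained.
df1 <- data.frame(
  FID     = c(1, 1, 2, 2, 2, 2),
  EW_INCU = c(45.26, 48.02, 51.44, NA, 55.70, 44.42),
  EW_17.5 = c(NA, 44.09, 47.81, 52.72, 50.45, 40.89),
  EMW     = c(15.61, 13.41, NA, 16.70, 16.48, 12.96),
  EEratio = c(NA, 0.3041506, NA, 0.3167678, 0.3266601, 0.3169479)
)
vars <- c("EW_INCU", "EW_17.5", "EMW", "EEratio")

# na.rm = TRUE is passed through aggregate()'s "..." to mean()
res_dots <- aggregate(df1[, vars], by = list(FID = df1$FID),
                      FUN = mean, na.rm = TRUE)

# The same thing with the argument fixed inside an anonymous function
res_anon <- aggregate(df1[, vars], by = list(FID = df1$FID),
                      FUN = function(x) mean(x, na.rm = TRUE))

# Both forms should give the same per-group means
all.equal(res_dots, res_anon)
```

Naming the grouping element (list(FID = ...)) is optional; it only replaces the default Group.1 column name in the result.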

#or 
library(plyr)
ddply(df1,.(FID),colwise(mean,c("EW_INCU","EW_17.5","EMW","EEratio")),na.rm=TRUE)
#  FID EW_INCU EW_17.5  EMW EEratio
#1   1    46.6    44.1 14.5   0.304
#2   2    50.5    48.0 15.4   0.320

#or
library(data.table)
df2<-data.table(df1)
df3<-df2[,c(1,4:7),with=FALSE]
df3[,lapply(.SD,mean,na.rm=TRUE),by=FID]
#   FID EW_INCU EW_17.5  EMW EEratio
#1:   2    50.5    48.0 15.4   0.320
#2:   1    46.6    44.1 14.5   0.304
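
One caveat if you prefer the formula interface of aggregate(): its default na.action is na.omit, which drops every row that has an NA in any of the variables before FUN is called, so na.rm=TRUE alone gives different means. A sketch, assuming the same data (rebuilt here so that the snippet is self-contained; res_omit and res_pass are illustrative names):

```r
# Same values as df1 above, rebuilt so this snippet runs on its own.
df1 <- data.frame(
  FID     = c(1, 1, 2, 2, 2, 2),
  EW_INCU = c(45.26, 48.02, 51.44, NA, 55.70, 44.42),
  EW_17.5 = c(NA, 44.09, 47.81, 52.72, 50.45, 40.89),
  EMW     = c(15.61, 13.41, NA, 16.70, 16.48, 12.96),
  EEratio = c(NA, 0.3041506, NA, 0.3167678, 0.3266601, 0.3169479)
)

# Default na.action = na.omit: rows with any NA are removed first,
# so only the three complete rows contribute to the means.
res_omit <- aggregate(cbind(EW_INCU, EW_17.5, EMW, EEratio) ~ FID,
                      data = df1, FUN = mean, na.rm = TRUE)

# na.action = na.pass keeps all rows; na.rm = TRUE then drops NAs
# column by column inside mean().
res_pass <- aggregate(cbind(EW_INCU, EW_17.5, EMW, EEratio) ~ FID,
                      data = df1, FUN = mean, na.rm = TRUE,
                      na.action = na.pass)

res_omit
res_pass
```

Only the na.pass version uses every non-missing value in each column, which is what the by= calls above do.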

A.K.



----- Original Message -----
From: Yao He <yao.h.1988 at gmail.com>
To: r-help at r-project.org
Cc: 
Sent: Saturday, December 15, 2012 10:44 PM
Subject: [R] how to handle NA values in aggregate()

Dear All:

I am trying to calculate four columns' means in a dataframe like this:

FID  MID     IID         EW_INCU EW_17.5   EMW        EEratio
1   4621  TWF2H5    45.26        NA             15.61         NA
1   4621  TWF2H6    48.02        44.09         13.41      0.3041506
2   4630  TWF2H19   51.44       47.81         NA             NA
2   4631  TWF2H21   NA          52.72         16.70      0.3167678
2   4632  TWF2H22   55.70       50.45         16.48      0.3266601
2   4633  TWF2H23   44.42       40.89         12.96      0.3169479

I tried this code:

> aggregate(df[,4:7],df[,1],mean)

But I couldn't set the argument na.rm=TRUE in the mean() function, so the
results are all NAs.

Please tell me how to handle NA values when using aggregate().

Thanks a lot

Yao He
—————————————————————————
Master candidate in 2nd year
Department of Animal genetics & breeding
Room 436,College of Animal Science&Technology,
China Agriculture University,Beijing,100193
E-mail: yao.h.1988 at gmail.com
——————————————————————————





